Pandas Mongo
Release 0.1.0
May 05, 2020
Contents
1 Overview
2 Quick Start
3 Installation
  3.1 Documentation
  3.2 Development
4 Installation
5 Quick Start
6 Reading dataframes from MongoDB using aggregation
7 Reference
  7.1 pdmongo
8 Contributing
  8.1 Bug reports
  8.2 Documentation improvements
  8.3 Feature requests and feedback
  8.4 Development
9 Authors
10 Changelog
  10.1 0.1.0 (2020-05-05)
  10.2 0.0.2 (2020-05-04)
  10.3 0.0.1 (2020-04-30)
  10.4 0.0.0 (2020-03-22)
11 Indices and tables
Python Module Index
Index
CHAPTER 1
Overview
This package allows you to read/write pandas dataframes in MongoDB in the simplest way possible.
• Free software: MIT license
CHAPTER 2
Quick Start
Writing a pandas DataFrame to a MongoDB collection:
import pandas as pd
import pdmongo as pdm

# Build a small DataFrame and write it to the "MyCollection" collection.
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
pdm.to_mongo(df, "MyCollection", "mongodb://localhost:27017/mydb")
Reading a MongoDB collection into a pandas DataFrame:
import pdmongo as pdm

df = pdm.read_mongo("MyCollection", [], "mongodb://localhost:27017/mydb")
print(df)
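The last argument in these examples is a MongoDB connection URI, and its path component appears to name the database ("mydb"). As a minimal sketch that uses only the Python standard library and does not call pdmongo, the parts of such a URI can be inspected as follows. The helper name describe_uri is hypothetical, and it only handles simple single-host URIs.

```python
from urllib.parse import urlparse


def describe_uri(uri):
    """Split a simple mongodb:// URI into scheme, host, port, and database.

    Hypothetical helper for illustration only; not part of pdmongo.
    """
    parts = urlparse(uri)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        # The database name is the URI path without its leading slash.
        "database": parts.path.lstrip("/"),
    }


info = describe_uri("mongodb://localhost:27017/mydb")
print(info)
```

Checking the URI this way before connecting can help catch a missing database name early.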
CHAPTER 3
Installation
pip install pdmongo
You can also install the in-development version with:
pip install https://github.com/pakallis/python-pandas-mongo/archive/master.zip
3.1 Documentation
https://python-pandas-mongo.readthedocs.io/
3.2 Development
To run all the tests, run:
tox
Note: to combine the coverage data from all the tox environments, run:
Windows:
set PYTEST_ADDOPTS=--cov-append
tox
Other:
PYTEST_ADDOPTS=--cov-append tox
CHAPTER 4
Installation
At the command line:
pip install pdmongo
CHAPTER 5
Quick Start
Writing a pandas DataFrame to a MongoDB collection:
import pandas as pd
import pdmongo as pdm

# Build a small DataFrame and write it to the "MyCollection" collection.
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
pdm.to_mongo(df, "MyCollection", "mongodb://localhost:27017/mydb")
Reading a MongoDB collection into a pandas DataFrame:
import pdmongo as pdm

df = pdm.read_mongo("MyCollection", [], "mongodb://localhost:27017/mydb")
print(df)
CHAPTER 6
Reading dataframes from MongoDB using aggregation
You can use an aggregation query to filter or transform data in MongoDB before fetching it into a DataFrame.
Reading a collection from MongoDB into a pandas DataFrame by using an aggregation query:
import pdmongo as pdm

query = [
    {"$match": {'A': 1}}
]
df = pdm.read_mongo("MyCollection", query, "mongodb://localhost:27017/mydb")
print(df)
The query takes the same form as the pipeline passed to the aggregate method of the pymongo package.
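Because the query is an ordinary list of stage dictionaries, it can be assembled programmatically. The sketch below is one possible way to do that, using the standard MongoDB stages $match, $project and $limit. The helper build_pipeline and its parameters are hypothetical and not part of pdmongo.

```python
def build_pipeline(match=None, fields=None, limit=None):
    """Assemble a MongoDB aggregation pipeline as a list of stage dicts.

    Hypothetical helper for illustration; not part of pdmongo.
    """
    pipeline = []
    if match:
        # Keep only documents whose fields equal the given values.
        pipeline.append({"$match": match})
    if fields:
        # 1 keeps a field; _id is excluded explicitly.
        projection = {name: 1 for name in fields}
        projection["_id"] = 0
        pipeline.append({"$project": projection})
    if limit is not None:
        pipeline.append({"$limit": limit})
    return pipeline


query = build_pipeline(match={"A": 1}, fields=["A", "B"], limit=100)
print(query)
```

The resulting list could then be passed as the query argument, for example pdm.read_mongo("MyCollection", query, "mongodb://localhost:27017/mydb"). Calling build_pipeline() with no arguments returns an empty list, the same empty query used in the Quick Start.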
CHAPTER 7
Reference
7.1 pdmongo
pdmongo.read_mongo(collection: str, query: List[Dict[str, Any]], db: Union[str, pymongo.database.Database], index_col: Union[str, List[str], None] = None, extra: Optional[Dict[str, Any]] = None, chunksize: Optional[int] = None) → pandas.core.frame.DataFrame
Read MongoDB query into a DataFrame.
Returns a DataFrame corresponding to the result set of the query. Optionally provide an index_col parameter to use one of the columns as the index; otherwise the default integer index will be used.
Parameters
• collection (str) – Mongo collection to select for querying
• query (list) – Must be an aggregate query. The input will be passed to pymongo's aggregate method.
• db (pymongo.database.Database or database string URI) – The database to use
• index_col (str or list of str, optional, default: None) – Column(s) to set as index (MultiIndex).
• extra (dict, optional, default: None) – Additional parameters to pass to the aggregate method.
• chunksize (int, default None) – If specified, return an iterator where chunksize is the number of docs to include in each chunk.
Returns DataFrame
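To illustrate what chunksize means, here is a toy model of splitting a stream of documents into batches, written with the standard library only. It is not pdmongo's implementation: iter_chunks and the sample documents are hypothetical, and with read_mongo each chunk would presumably arrive as a DataFrame rather than a list.

```python
from itertools import islice


def iter_chunks(docs, chunksize):
    """Yield lists of at most `chunksize` documents from an iterable.

    Toy model for illustration; not how pdmongo is implemented.
    """
    iterator = iter(docs)
    while True:
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            return
        yield chunk


docs = [{"A": i} for i in range(5)]
sizes = [len(chunk) for chunk in iter_chunks(docs, 2)]
print(sizes)
```

Processing results chunk by chunk keeps only one batch in memory at a time, which is the usual reason to set chunksize on large collections.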
pdmongo.to_mongo(frame: pandas.core.frame.DataFrame, name: str, db: Union[str, pymongo.database.Database], if_exists: Optional[str] = 'fail', index: Optional[bool] = True, index_label: Union[str, Sequence[str], None] = None, chunksize: Optional[int] = None) → Union[List[pymongo.results.InsertManyResult], pymongo.results.InsertManyResult]
Write records stored in a DataFrame to a MongoDB collection.
Parameters
• frame (DataFrame, Series)
• name (str) – Name of collection.
• db (pymongo.database.Database or database string URI) – The database to write to
• if_exists ({‘fail’, ‘replace’, ‘append’}, default ‘fail’) –
– fail: If the collection exists, do nothing.
– replace: If the collection exists, drop it, recreate it, and insert data.
– append: If the collection exists, insert data. Create it if it does not exist.
• index (boolean, default True) – Write DataFrame index as a column.
• index_label (str or sequence, optional) – Column label for index column(s). If None is given (default) and index is True, then the index names are used. A sequence should be given if the DataFrame uses MultiIndex.
• chunksize (int, optional) – Specify the number of rows in each batch to be written at a time. By default, all rows will be written at once.
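The if_exists modes listed above can be summarized as a small decision table. The sketch below is a toy model that follows the wording of this reference. It does not call pdmongo or MongoDB, the function plan_write and the step names are hypothetical, and the behavior for a collection that does not yet exist (create it, then insert) is an assumption.

```python
def plan_write(collection_exists, if_exists="fail"):
    """Return the steps implied by the documented if_exists modes.

    Toy model for illustration; not pdmongo's implementation.
    """
    if if_exists not in ("fail", "replace", "append"):
        raise ValueError(f"unknown if_exists value: {if_exists!r}")
    if not collection_exists:
        # Assumption: with nothing to conflict with, every mode creates
        # the collection and inserts the data.
        return ["create", "insert"]
    if if_exists == "fail":
        return []  # documented above as "do nothing"
    if if_exists == "replace":
        return ["drop", "create", "insert"]
    return ["insert"]  # append


for mode in ("fail", "replace", "append"):
    print(mode, plan_write(True, mode))
```

Laying the modes out like this makes it clear that only replace discards existing documents.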
CHAPTER 8
Contributing
Contributions are welcome, and they are greatly appreciated! Every little bit helps, and credit will always be given.
8.1 Bug reports
When reporting a bug please include:
• Your operating system name and version.
• Any details about your local setup that might be helpful in troubleshooting.
• Detailed steps to reproduce the bug.
8.2 Documentation improvements
Pandas Mongo could always use more documentation, whether as part of the official Pandas Mongo docs, in docstrings, or even on the web in blog posts, articles, and such.
8.3 Feature requests and feedback
The best way to send feedback is to file an issue at https://github.com/pakallis/python-pandas-mongo/issues.
If you are proposing a feature:
• Explain in detail how it would work.
• Keep the scope as narrow as possible, to make it easier to implement.
• Remember that this is a volunteer-driven project, and that code contributions are welcome :)
8.4 Development
To set up python-pandas-mongo for local development:
1. Fork python-pandas-mongo (look for the “Fork” button).
2. Clone your fork locally:
git clone git@github.com:pakallis/python-pandas-mongo.git
3. Create a branch for local development:
git checkout -b name-of-your-bugfix-or-feature
Now you can make your changes locally.
4. When you’re done making changes, run all the checks and the docs builder with one tox command:
tox
5. Commit your changes and push your branch to GitHub:
git add .
git commit -m "Your detailed description of your changes."
git push origin name-of-your-bugfix-or-feature
6. Submit a pull request through the GitHub website.
8.4.1 Pull Request Guidelines
If you need some code review or feedback while you’re developing the code, just make the pull request.
For merging, you should:
1. Include passing tests (run tox) [1].
2. Update documentation when there’s new API, functionality etc.
3. Add a note to CHANGELOG.rst about the changes.
4. Add yourself to AUTHORS.rst.
8.4.2 Tips
To run a subset of tests:
tox -e envname -- pytest -k test_myfeature
To run all the test environments in parallel (you need to pip install detox):
detox
[1] If you don’t have all the necessary Python versions available locally, you can rely on Travis: it will run the tests for each change you add in the pull request. It will be slower though ...
CHAPTER 10
Changelog
10.1 0.1.0 (2020-05-05)
• Added static typing
• Added mypy to Travis CI
• Removed unnecessary params
10.2 0.0.2 (2020-05-04)
• Dropped support for pypy3
10.3 0.0.1 (2020-04-30)
• Added read_mongo and basic support for reading MongoDB collections into pandas dataframes
• Added to_mongo and basic support for writing pandas dataframes to MongoDB collections
10.4 0.0.0 (2020-03-22)
• First release on PyPI.
CHAPTER 11
Indices and tables
• genindex
• modindex
• search
Python Module Index
pdmongo

Index
pdmongo (module)
read_mongo() (in module pdmongo)
to_mongo() (in module pdmongo)