Data Frames are widely used structures for data wrangling. The querier exposes a query language for Python pandas Data Frames, inspired by the querying logic of SQL relational databases.
Installing | Package description | Contributing | Tests | API Documentation | Dependencies | License
- From PyPI:
pip install querier
- From GitHub, for the development version:
pip install git+https://github.com/thierrymoudiki/querier.git
There are currently 9 types of operations available in the querier, with no plan to extend that list much further (to maintain a relatively simple mental model). These verbs will look familiar to dplyr users, but the implementation (I used numpy, pandas and SQLite3) and the function signatures are different:
- concat: concatenates 2 Data Frames, either horizontally or vertically
- delete: deletes rows from a Data Frame based on given criteria
- drop: drops columns from a Data Frame
- filtr: filters rows of the Data Frame based on given criteria
- join: joins 2 Data Frames based on given criteria (available for completeness of the interface, this operation is already straightforward in pandas)
- select: selects columns from the Data Frame
- summarize: obtains summaries of data based on grouping columns
- update: updates a column, using an operation given by the user
- request: for operations more complex than the previous 8 ones, makes it possible to use a SQL query on the Data Frame
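To give a feel for what running a SQL query on a Data Frame involves, here is a minimal sketch that uses only pandas and the standard-library sqlite3 module. It is not the querier's implementation or API: the helper name `sql_on_frame`, its parameters, and the sample data are hypothetical and chosen only for illustration.

```python
import sqlite3

import pandas as pd


def sql_on_frame(df, query, table_name="df"):
    """Hypothetical helper (not part of the querier): run a SQL query
    against a Data Frame by loading it into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    try:
        # Copy the Data Frame into a temporary SQLite table
        df.to_sql(table_name, conn, index=False)
        # Execute the query and return the result as a new Data Frame
        return pd.read_sql_query(query, conn)
    finally:
        conn.close()


# Made-up sample data
tips = pd.DataFrame(
    {
        "sex": ["F", "M", "M", "F"],
        "tip": [1.0, 3.0, 5.0, 2.0],
    }
)

# Average tip per group, expressed in SQL
result = sql_on_frame(
    tips,
    "SELECT sex, AVG(tip) AS avg_tip FROM df GROUP BY sex ORDER BY sex",
)
print(result)
```

The table name used in the query must match `table_name`; here both default to `df`.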
The following notebooks present examples of use of the querier:
- concat example
- delete example
- drop example
- filtr example
- join example
- select example
- summarize example
- update example
- request example
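For readers who do not have the notebooks at hand, the sketch below shows rough plain-pandas counterparts of several of the verbs. It deliberately does not call the querier, whose function signatures are documented in the notebooks and the API documentation; the column names and data are made up, and the mapping of each verb to a pandas idiom is an assumption for illustration.

```python
import pandas as pd

# Made-up sample data
df = pd.DataFrame(
    {
        "sex": ["F", "M", "M", "F"],
        "smoker": ["no", "yes", "no", "no"],
        "tip": [1.0, 3.0, 5.0, 2.0],
    }
)

# select: keep only some columns
selected = df[["sex", "tip"]]

# drop: remove columns
dropped = df.drop(columns=["smoker"])

# filtr: keep rows matching a criterion
filtered = df.query("tip > 2")

# delete: remove rows matching a criterion
deleted = df[~(df["tip"] > 2)]

# summarize: aggregate by grouping columns
summary = df.groupby("sex", as_index=False)["tip"].mean()

# update: recompute a column with a user-given operation
updated = df.assign(tip=df["tip"] * 2)

# concat: stack two Data Frames vertically
stacked = pd.concat([df, df], ignore_index=True)
```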
Your contributions are welcome and valuable. Please make sure to read the Code of Conduct first.
If you're not comfortable with Git/Version Control yet, please use this form.
In Pull Requests, let's strive to use black for formatting:
pip install black
black --line-length=80 file_submitted_for_pr.py
TBD
https://techtonique.github.io/querier/
- Numpy
- Pandas
- SQLite3
BSD 3-Clause © Thierry Moudiki, 2019.
