.. slik-wrangler documentation master file, created by sphinx-quickstart on Thu Mar 18 18:16:23 2021. You can adapt this file completely to your liking, but it should at least contain the root `toctree` directive. Welcome to Slik-Wrangler documentation! ======================================== .. image:: images/slik.png :width: 300px :height: 200px :scale: 100 % :alt: alternate text :align: center slik-wrangler is a data to modeling tool that helps data scientists navigate the issues of basic data wrangling and preprocessing steps. The idea behind slik-wrangler is to jump-start supervised learning projects. Data scientists struggle to prepare their data for building machine learning models and all machine learning projects require data wrangling, data preprocessing, feature engineering which takes about 80% of the model building process. slik-wrangler has several tools that make it easy to load data of any format, clean and inspect your data. It offers a quick way to pre-process data and perform feature engineering. Building machine learning models is an inherently iterative task and data scientists face challenges of reproducing the models and productionalizing model pipelines. With slik-wrangler, Data scientists can build model pipelines. slik-wrangler provides explainability in the pipeline process in the form of DAG showing each step in the build process. With every build process/experiment, slik-wrangler logs the metadata for each run. slik-wrangler provides an easy-to-use solution for supervised machine learning. Here is a link to the staging repository. This project tries to help make supervised machine learning more accessible for beginners, and reduce boilerplate for common tasks. This library is in very active development, so it's not recommended for production use. Development at `github.com/AdesholaAfolabi/slik-wrangler_python_package/staging/ `_. Examples -------- A minimum example of using slik-wrangler for preprocessing is: >>> from slik_wrangler import preprocessing as pp >>> from sklearn.model_selection import train_test_split >>> from sklearn.datasets import titanic >>> X, y = titanic(return_X_y=True) >>> pp.preprocess(data=X,target_column='Survived',train=True,verbose=False,project_path='./Titanic'\ ,logging='display') >>> X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1) >>> lr = LogisticRegression.fit(X_train, y_train) Running ... >>> print("Accuracy score", lr.score(X_test, y_test)) Accuracy score 0.9... .. toctree:: :maxdepth: 3 :caption: Contents installs .. toctree:: :maxdepth: 3 :caption: Getting Started quick_start titanic.ipynb .. toctree:: :maxdepth: 2 :caption: slik-wrangler API: modules Indices and tables ================== * :ref:`genindex` * :ref:`modindex` * :ref:`search`