Identifying and Correcting Data Anomalies in E-commerce Customer Data

Problem:

E-commerce businesses often struggle with inaccurate or incomplete customer data, which can lead to inefficiencies in marketing campaigns, customer service interactions, and fraud detection efforts.

Solution:

Implementing EDA techniques allows e-commerce businesses to identify and correct data anomalies in customer records, such as missing values, invalid formats, and outliers.
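
The three anomaly types named above can be checked with a few lines of pandas. The sketch below is only an illustration and is not part of the app: the `find_anomalies` helper, the column names (`customer_id`, `email`, `order_total`), the sample records, the email pattern, and the 1.5 * IQR outlier rule are all assumptions chosen for the example.

```python
import pandas as pd


def find_anomalies(df: pd.DataFrame) -> dict:
    """Flag missing values, malformed emails, and order-total outliers.

    Hypothetical helper: assumes columns named customer_id, email, order_total.
    """
    # Missing values: number of nulls in each column.
    missing_counts = df.isna().sum()

    # Invalid formats: emails that are present but fail a simple pattern check.
    email_ok = df["email"].str.match(r"^[^@\s]+@[^@\s]+\.[^@\s]+$", na=False)
    invalid_email = df["email"].notna() & ~email_ok

    # Outliers: values more than 1.5 * IQR below Q1 or above Q3.
    q1, q3 = df["order_total"].quantile([0.25, 0.75])
    iqr = q3 - q1
    outlier = (df["order_total"] < q1 - 1.5 * iqr) | (df["order_total"] > q3 + 1.5 * iqr)

    return {
        "missing_counts": missing_counts,
        "invalid_email_ids": df.loc[invalid_email, "customer_id"].tolist(),
        "outlier_ids": df.loc[outlier, "customer_id"].tolist(),
    }


# Made-up customer records, for illustration only.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4, 5, 6],
    "email": ["ann@example.com", "bob@example", None,
              "dee@example.com", "eve@example.com", "fay@example.com"],
    "order_total": [25.0, 30.0, 27.5, 5000.0, 22.0, 29.0],
})

result = find_anomalies(customers)
print(result["missing_counts"])
print("Invalid emails:", result["invalid_email_ids"])
print("Outliers:", result["outlier_ids"])
```

Flagged rows can then be reviewed or corrected by hand. The thresholds and the pattern would need tuning for real customer data.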

Results:

A 20% reduction in customer data errors, leading to increased accuracy in marketing campaigns, customer segmentation, and fraud detection.
A 15% improvement in customer service efficiency, as accurate customer data enables faster resolution of inquiries and issues.

An Exploratory Data Analysis App

TAPIWA CHAMBOKO

🚀 About Me

I'm a full-stack developer experienced in deploying artificial intelligence-powered apps.

Authors

@Tapiwa chamboko

Acknowledgements

dataprofessor
Pandas Profiling in Data Science

Demo

Live demo

Click here for Live demo

Installation

Install required packages

  pip install streamlit
  pip install pycaret
  pip install scikit-learn==0.23.2
  pip install numpy
  pip install seaborn
  pip install pandas
  pip install matplotlib
  pip install plotly-express
  pip install streamlit-lottie
  pip install pandas-profiling
  pip install streamlit-pandas-profiling
  pip install openpyxl

Datasets

Drop your datasets into the app to get results.
You can also use the example data provided in the app.

Code

import base64
from io import StringIO, BytesIO

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import plotly.express as px
import streamlit as st
from sklearn import datasets
from pandas_profiling import ProfileReport
from streamlit_pandas_profiling import st_profile_report

def app():
    st.markdown('''
# **Exploratory data analysis App**
Please upload your xlsx file or click the button below to use example dataset
---
''')

    # Upload XLSX data
    with st.sidebar.header('Upload your XLSX data'):
        uploaded_file = st.sidebar.file_uploader("Upload your input XLSX file", type=["xlsx"])
       

    # Pandas Profiling Report
    if uploaded_file is not None:
        # Cache the parsed workbook so Streamlit reruns do not re-read the file.
        # Note: newer Streamlit releases deprecate st.cache in favor of st.cache_data.
        @st.cache
        def load_xlsx():
            # openpyxl is the engine pandas uses to read .xlsx workbooks
            return pd.read_excel(uploaded_file, engine='openpyxl')
        df = load_xlsx()
        pr = ProfileReport(df, explorative=True)
        st.header('**Input DataFrame**')
        st.write(df)
        st.write('---')
        st.header('**Exploratory data analysis Report**')
        st_profile_report(pr)
        
    else:
        st.info('Awaiting XLSX file upload.')
        
        if st.button('Press to use Example Dataset'):
            # Example data
            @st.cache
            def load_data():
                a = pd.DataFrame(
                    np.random.rand(100, 5),
                    columns=['a', 'b', 'c', 'd', 'e']
                )
                return a
            df = load_data()
            pr = ProfileReport(df, explorative=True)
            st.header('**Input DataFrame**')
            st.write(df)
            st.write('---')
            st.header('**Exploratory data analysis Report**')
            st_profile_report(pr)
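
For a rough sense of the per-column statistics a profiling report surfaces, the sketch below computes a small summary table with plain pandas. It is a simplified illustration under stated assumptions, not how `pandas_profiling` is implemented: the `summarize_columns` helper and the statistics it reports are hypothetical choices, and the sample frame mirrors the app's random example dataset with one missing value injected.

```python
import numpy as np
import pandas as pd


def summarize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column with basic profiling statistics.

    Hypothetical helper for illustration; not part of pandas_profiling.
    """
    summary = pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        "missing_pct": (df.isna().mean() * 100).round(1),
        "unique": df.nunique(),
    })
    # Mean and standard deviation only apply to numeric columns;
    # other columns are left as NaN by index alignment.
    numeric = df.select_dtypes(include="number")
    summary["mean"] = numeric.mean()
    summary["std"] = numeric.std()
    return summary


# Random data shaped like the app's example dataset, with one gap added.
rng = np.random.default_rng(0)
example = pd.DataFrame(rng.random((100, 5)), columns=list("abcde"))
example.loc[0, "a"] = np.nan

report = summarize_columns(example)
print(report)
```

A table like this is a quick sanity check before generating the full report, which adds distributions, correlations, and warnings on top.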

Deployment

To deploy this project, we used Streamlit to build the web app.

Run the command below:

  streamlit run app.py

Appendix

Happy Coding!!!!!!
