Retail Refine: Enhancing Retail Transaction Data for Advanced Analytics


Authors : Samir Pandey; Ami Shah

Volume/Issue : Volume 10 - 2025, Issue 3 - March


Google Scholar : https://tinyurl.com/yebzy7sz

Scribd : https://tinyurl.com/584xyn8a

DOI : https://doi.org/10.38124/ijisrt/25mar1342



Abstract : Reliable analysis and decision-making depend on high-quality data. This paper examines data cleaning and preparation for advanced analytics, focusing on handling missing values, outlier detection, data transformation, and feature engineering. A case study applies these techniques to a retail transaction dataset and then performs time series analysis, cohort segmentation, churn analysis, and customer segmentation. The goal is to make the data more reliable and usable for machine learning and predictive modeling.

Keywords : Data Cleaning, Data Preparation, Time Series Analysis, Cohort Segmentation, Churn Analysis, Outlier Detection, Feature Engineering.
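
The cleaning and feature engineering steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the records, field names, median imputation, 1.5 x IQR outlier rule, and per-customer features below are all assumptions chosen for illustration, and only the Python standard library is used.

```python
from datetime import date
from statistics import median, quantiles

# Hypothetical retail transactions; field names and values are illustrative only.
transactions = [
    {"customer_id": "C1", "date": date(2024, 1, 5), "amount": 20.0},
    {"customer_id": "C1", "date": date(2024, 2, 9), "amount": 25.0},
    {"customer_id": "C2", "date": date(2024, 1, 20), "amount": None},    # missing amount
    {"customer_id": "C2", "date": date(2024, 3, 2), "amount": 30.0},
    {"customer_id": "C3", "date": date(2024, 1, 11), "amount": 22.0},
    {"customer_id": "C3", "date": date(2024, 1, 25), "amount": 5000.0},  # suspiciously large
    {"customer_id": None, "date": date(2024, 2, 1), "amount": 18.0},     # missing customer
]


def clean(rows):
    """Drop rows without a customer, impute missing amounts, flag IQR outliers."""
    # Copy each row so the input list is left unchanged.
    rows = [dict(r) for r in rows if r["customer_id"] is not None]
    known = [r["amount"] for r in rows if r["amount"] is not None]

    # Assumption: impute missing amounts with the median of the known amounts.
    fill_value = median(known)

    # Assumption: flag amounts outside 1.5 * IQR of the known amounts.
    q1, _, q3 = quantiles(known, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    for r in rows:
        if r["amount"] is None:
            r["amount"] = fill_value
        r["is_outlier"] = not (low <= r["amount"] <= high)
    return rows


def customer_features(rows, as_of):
    """Aggregate order count, total spend, and recency per customer."""
    acc = {}
    for r in rows:
        if r["is_outlier"]:
            continue  # assumption: exclude flagged rows instead of capping them
        f = acc.setdefault(
            r["customer_id"], {"orders": 0, "spend": 0.0, "last": r["date"]}
        )
        f["orders"] += 1
        f["spend"] += r["amount"]
        f["last"] = max(f["last"], r["date"])
    return {
        cid: {
            "orders": f["orders"],
            "spend": f["spend"],
            "recency_days": (as_of - f["last"]).days,
        }
        for cid, f in acc.items()
    }


cleaned = clean(transactions)
features = customer_features(cleaned, as_of=date(2024, 3, 31))
for cid, f in sorted(features.items()):
    print(cid, f)
```

Features of this kind (order count, spend, recency) are a common starting point for churn analysis and customer segmentation. Whether flagged outliers should be excluded, capped, or kept depends on the dataset and is a design choice this sketch does not settle.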


