
In your final project you will be addressing a research question that involves a supervised or unsupervised learning problem. Your goal is to submit a cohesive project report that conveys that you have mastered some of the data mining and machine learning techniques that we have discussed in class and that helps you answer your research question. You must use Python for your project, and can use any of the methods/libraries/algorithms used and discussed during this semester, or any other techniques you want to explore.
 
 
1. State the problem/area you will focus on for your final project. Think about the type of problem you would like to study and whether it involves a classification (supervised learning) or clustering (unsupervised learning) framework.
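
To make the distinction in item 1 concrete, here is a minimal sketch assuming scikit-learn is installed. The Iris data set and the two models are stand-ins chosen only for illustration; they are not prescribed by this assignment.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Iris is a stand-in data set (an assumption, not part of the assignment).
X, y = load_iris(return_X_y=True)

# Supervised (classification): the model is fit on the features AND the known labels.
classifier = LogisticRegression(max_iter=1000)
classifier.fit(X, y)
predicted_labels = classifier.predict(X)

# Unsupervised (clustering): the model sees only the features, never the labels.
clusterer = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = clusterer.fit_predict(X)

print("Distinct predicted classes:", len(set(predicted_labels)))
print("Distinct clusters found:", len(set(cluster_ids)))
```

If your data set comes with a column you want to predict, you are most likely in the first setting; if you only want to discover groups in the data, you are in the second.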
 
2. Describe the data set and perform an exploratory data analysis. The following sites can be used to find a relevant data set:
 
US Government Public Data Sets: http://www.data.gov/
 
UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets.html
 
KDnuggets: http://www.kdnuggets.com/datasets/
 
Perform relevant descriptive statistics, including summary statistics and/or visualization of the data. Think about what conditions you might need to check for your analysis and what summaries of the data might be useful for your analysis. Load the data in Python and familiarize yourself with it.
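
A first pass at the exploratory analysis might look like the sketch below. It assumes pandas and scikit-learn are installed and uses the built-in Iris data as a stand-in; for your own project you would load your file instead (the file name in the comment is hypothetical).

```python
import pandas as pd
from sklearn.datasets import load_iris

# Stand-in data. For your own data set you would use something like
# df = pd.read_csv("your_file.csv")  (hypothetical file name).
iris = load_iris(as_frame=True)
df = iris.frame  # the four features plus a "target" column

print(df.shape)   # number of rows and columns
print(df.dtypes)  # type of each variable
print(df.head())  # first few observations

summary = df.describe()                      # count, mean, std, min, quartiles, max
missing_counts = df.isna().sum()             # missing values per column
class_counts = df["target"].value_counts()   # class balance of the label
correlations = df.corr()                     # pairwise Pearson correlations

print(summary)
print(missing_counts)
print(class_counts)
```

With matplotlib installed, `df.hist()` or `pandas.plotting.scatter_matrix(df)` would add a quick visual summary to the numeric one.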
 
3. Make sure to address the following points:
 
▪ Pre-processing: Describe any pre-processing you need to perform on the data set (PCA, missing values, inserting (imputing) values, reducing the number of features, etc.).
▪ Data source: Include the citation for your data and a link to the source.
▪ Data collection: Is there any information on how the data was collected?
▪ Variables: What are the variables you will be studying?
▪ There is no limit on what tools or libraries you may use. Briefly explain what you will be using for your project.
MAT 602 – Applied Machine Learning  Spring 2017
St Thomas University
▪ You are highly encouraged to use IPython (Jupyter Notebook) to produce your report. The final report can be submitted in .html or .pdf format.
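
The pre-processing steps named above (missing values, imputation, PCA, feature reduction) can be chained together. The following is only a sketch under stated assumptions: it uses scikit-learn, the Iris data as a stand-in, artificially introduced missing values, and median imputation; your data may call for different choices.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)

# Knock out about 10% of the values so there is something to impute
# (artificial missingness, for illustration only).
rng = np.random.default_rng(0)
X_missing = X.copy()
X_missing[rng.random(X.shape) < 0.10] = np.nan

preprocess = Pipeline([
    ("impute", SimpleImputer(strategy="median")),  # fill gaps with column medians
    ("scale", StandardScaler()),                   # zero mean, unit variance
    ("pca", PCA(n_components=2)),                  # reduce 4 features to 2 components
])
X_reduced = preprocess.fit_transform(X_missing)

print(X_reduced.shape)
print(preprocess.named_steps["pca"].explained_variance_ratio_)
```

Scaling before PCA matters because PCA is driven by variance, so unscaled features with large units would otherwise dominate the components.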
 
4. Methods: Make sure to include a brief description of the methods/algorithms you will be using (e.g. SVM, Logistic Regression, Random Forest, etc.). Describe the main characteristics of each algorithm and explain why you chose it for your project.
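
One way to support your choice of method is to compare a few candidates under the same cross-validation scheme. The sketch below assumes scikit-learn, uses the Iris data as a stand-in, and leaves the hyperparameters near their defaults; it is an illustration, not a required workflow.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# The three methods named in item 4. The first two are scale-sensitive,
# so they are wrapped in a pipeline that standardizes the features first.
models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "SVM (RBF kernel)": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

mean_accuracy = {}
for name, model in models.items():
    # 5-fold cross-validation; for classifiers the default score is accuracy.
    scores = cross_val_score(model, X, y, cv=5)
    mean_accuracy[name] = scores.mean()
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Accuracy alone rarely settles the choice; interpretability, training cost, and how well the method's assumptions fit your data are also worth discussing in the report.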
 
Some examples of projects in Machine Learning include:
• What factors cause retweets?
• Classification of Cardiac Arrhythmia Patients.
• Predicting NCAA basketball game outcomes and point differentials.
• Determining the factors that are predictive of happiness.
• Determining predictors for purchasing a bad deal at auctions.
• Seizure Classification with EEG data.
• Predicting the type of wine to pair with a dinner.
• Segmentation on MRI scans.
• Prediction of Yelp Review Star Rating using Sentiment Analysis.
• Movie recommender systems.
• Beyond the Gender Wage Gap.
• Identifying Gender from Facial Features.
• Semi-supervised learning.
 
 
5. Conclusions: You must use correct data mining and machine learning terminology, and must also explain your conclusions in a way that anyone can understand. Remember to interpret your findings in the context of your research question.
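
For a classification project, conclusions are easier to explain when they rest on results from data the model has not seen. The sketch below is one possible way to produce such results, assuming scikit-learn; the Iris data, the 70/30 split, and the Random Forest model are illustrative choices only.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

data = load_iris()

# Hold out 30% of the observations; the model never sees them during training.
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, stratify=data.target, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
matrix = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted class

print(f"Held-out accuracy: {accuracy:.3f}")
print(matrix)
print(classification_report(y_test, y_pred, target_names=data.target_names))
```

In the report, a table like the confusion matrix can then be translated into plain language, for example by stating which classes the model confuses with each other and what that means for the research question.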
 
 