BI Assignment using BigML

Crowd Funding

Introduction

PleaseFundThis.com is a web site that allows users to create projects in order to obtain crowd-funding for creative pursuits such as films, music, stage shows, comics, journalism, video games and so on. Each project seeks monetary pledges from people. Most often, projects offer tangible rewards and one-of-a-kind experiences in exchange for pledges.

Similar examples:

http://www.pozible.com/project/31823

https://www.gofundme.com/seespotshred

Data from PleaseFundThis has been gathered and is available in the file

named PleaseFundThis.xlsx.

The columns within the spreadsheet are:

- project_name
- number_of_pledgers
- date_launched
- comments_count
- duration_days
- avg_amt$_per_pledger
- goal_$
- project_has_video
- percent_raised
- project_has_facebook_page
- project_state
- facebook_friends_count
- amt_pledged_$
- project_has_pledge_rewards
- major_category
- lowest_pledge_level_$
- minor_category
- highest_pledge_level_$
- project_updated_count
- total_count_of_pledge_levels
- city
- success
- region

Penny Robinson is a very talented young woman. She can act, sing, write, paint etc. In

fact she can do anything creative.

Penny’s only problem is that she doesn’t have any money. She wants to obtain money via crowd-funding to fund a creative project. She knows that you have a spreadsheet full of data. She wants to know if you can give her any advice about creating a crowd-funded project based on the data in the file, e.g. what will make her project more likely to succeed?

Tasks

Your tasks are to:

Explore the data. Find information that you think will be useful to Penny (and to other people who want to create similar crowd-funding projects).

Attempt to determine which attributes of a project (if any) are the most important in terms of whether the project succeeds (raises the required funds).

List and present the most important attributes and justify why this is so (with the use of BigML screen dumps and supporting explanation / discussion).
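As a rough cross-check on what BigML reports, the success rate can be compared across the values of a single attribute. The sketch below is only an illustration: the function name `success_rate_by`, the coding of `success` as the string "1", and the sample rows are all assumptions (the rows are invented and are not taken from PleaseFundThis.xlsx).

```python
from collections import defaultdict


def success_rate_by(rows, field, outcome_field="success", success_value="1"):
    """Return {value of `field`: share of rows whose outcome equals success_value}."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for row in rows:
        key = row[field]
        totals[key] += 1
        if row[outcome_field] == success_value:
            wins[key] += 1
    return {key: wins[key] / totals[key] for key in totals}


# Invented rows for illustration only; the real values and their coding
# (e.g. "yes"/"no", "1"/"0") must be checked against the actual file.
sample_rows = [
    {"project_has_video": "yes", "success": "1"},
    {"project_has_video": "yes", "success": "1"},
    {"project_has_video": "yes", "success": "0"},
    {"project_has_video": "no", "success": "0"},
    {"project_has_video": "no", "success": "1"},
]

rates = success_rate_by(sample_rows, "project_has_video")
for value, rate in sorted(rates.items()):
    print(f"project_has_video={value}: {rate:.0%} succeeded")
```

A large gap between groups on the real data would be a hint, not proof, that the attribute matters; the BigML model and its field importances remain the evidence the assignment asks for.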

You will need to do this based around two models.

In the first model, use any / all of the columns in the dataset. However, some columns probably aren’t suitable for analysis. BigML may automatically flag some columns as unsuitable (e.g. Project Id – as every row has a unique value, it is of no analytical use). In other cases, you will need to decide which columns are not suitable for analysis.
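The point about identifier-like columns can be checked by hand before uploading. The following is a minimal sketch, not BigML's own logic: `id_like_columns` is a hypothetical helper, and `project_id` stands in for the "Project Id" example mentioned above. It simply lists the columns in which no value repeats.

```python
def id_like_columns(rows):
    """Return names of columns in which every row holds a different value."""
    if not rows:
        return []
    result = []
    for name in rows[0]:
        values = [row[name] for row in rows]
        if len(set(values)) == len(values):
            result.append(name)
    return result


# Invented rows for illustration only.
sample_rows = [
    {"project_id": "101", "major_category": "Film"},
    {"project_id": "102", "major_category": "Music"},
    {"project_id": "103", "major_category": "Film"},
]

unique_columns = id_like_columns(sample_rows)
print(unique_columns)
```

On a small sample a numeric column can be unique by chance, so a column flagged this way is a candidate for exclusion rather than an automatic one.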

In the second model, you must exclude these columns: percent_raised, amt_pledged_$, avg_amt$_per_pledger, project_state, number_of_pledgers and project_updated_count.
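In BigML itself the excluded fields would normally be deselected when configuring the dataset or model. If the columns are instead stripped before upload, the step could look like the sketch below. The helper `drop_columns` is hypothetical, the column spellings follow the column list earlier in this brief (an assumption about the real headers), and the sample row is invented.

```python
# Columns to leave out of the second model, spelled as in the column list.
EXCLUDED_FOR_MODEL_2 = {
    "percent_raised",
    "amt_pledged_$",
    "avg_amt$_per_pledger",
    "project_state",
    "number_of_pledgers",
    "project_updated_count",
}


def drop_columns(rows, excluded):
    """Return copies of the rows without the excluded column names."""
    return [
        {name: value for name, value in row.items() if name not in excluded}
        for row in rows
    ]


# Invented row for illustration only.
sample_rows = [
    {
        "goal_$": "5000",
        "project_has_video": "yes",
        "percent_raised": "120",
        "number_of_pledgers": "85",
        "success": "1",
    }
]

model_2_rows = drop_columns(sample_rows, EXCLUDED_FOR_MODEL_2)
print(model_2_rows[0])
```

These six columns all appear to describe how a campaign went rather than how it was set up, which is presumably why the second model has to do without them.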

Suggestion: Save the Excel file as a CSV file before loading it into BigML. This generally allows column headings to be used as field names rather than ‘field1’, ‘field2’ …
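Before uploading, it can be worth confirming that the first row of the exported CSV really contains the column headings. This is a small sketch using Python's standard `csv` module; `header_fields` is a hypothetical helper and the CSV text is an invented stand-in for the exported file.

```python
import csv
import io


def header_fields(csv_text):
    """Return the first row of a CSV document as a list of field names."""
    reader = csv.reader(io.StringIO(csv_text))
    return next(reader, [])


# Invented stand-in for the start of the exported CSV file.
sample_csv = "project_name,goal_$,success\nExample project,5000,1\n"

fields = header_fields(sample_csv)
print(fields)
```

For the real file, the text would be read from the saved CSV rather than from a string.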
