Datasets and Big Data

By the third class, students will select or be assigned a company to research. The project proposal is a two-page, double-spaced summary that answers the questions below.

1. Issue. What organization and issue? Select an organization and identify an issue that is generally considered important to that organization. What issue or problem will you focus on? Develop a question and at least one hypothesis related to this issue. Try to select issues with clear context and boundaries. Issues that are too vague or too specific will be difficult, if not impossible, to analyze and visualize. (Example: Property values, population, and median income are all in decline in the South New Jersey area: Gloucester, Salem, and Cumberland counties.)

2. Data. Where will you get your data? Identify authoritative data source(s) likely to provide the answer to the question/hypotheses.

Review the data sources’ documentation. Identify any problems or concerns.

3. Anticipated Problems. Do you expect any problems with the data? Describe how you will deal with any problems with the data.

4. Visualization Plan. What is your plan for visualizing the results of your analysis? Decide what narrative you wish to tell and how you intend to present the data. (It is understood that, once you get deeper into the data analysis using Tableau, not all good intentions can be realized.)

Update: The biggest issue will be finding appropriate datasets to make your project feasible. I would suggest working backwards from available datasets. More about that in a moment.

The current assignment says to identify a company. That is perhaps too restrictive; companies are not always willing to share information. It is acceptable to broaden the scope to the industry or market level. It is also acceptable to consider consumer behavior.

Keep these two thoughts in mind: broaden the scope, and start with available datasets, particularly ones that don’t charge for access. You may want to use public domain datasets from government sites or from a research site like Pew Research Center (also free to access and download). With a wider scope and available public datasets, you should be able to find a topic that is interesting and lends itself to visualization.
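
One way to work backwards from an available dataset is to screen a downloaded file quickly before committing to a topic. The Python sketch below is only an illustration and not part of the assignment: it counts rows and missing values per column in a CSV file. The sample data, the column names, and the list of "missing" markers are all assumptions made up for the example; a real dataset's documentation should say how it codes missing values.

```python
import csv
import io

# Hypothetical sample standing in for a downloaded public dataset.
# The column names and values are made up for illustration only.
SAMPLE_CSV = """county,year,median_income
Gloucester,2010,
Salem,2010,55000
Cumberland,2010,N/A
"""

# Assumed conventions for missing values; check the source's documentation.
MISSING_MARKERS = {"", "NA", "N/A", "null", "-"}


def profile_csv(text):
    """Return (row_count, {column: missing_count}) for CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    missing = {name: 0 for name in reader.fieldnames or []}
    rows = 0
    for row in reader:
        rows += 1
        for name in missing:
            # Treat absent cells and known markers as missing.
            value = (row.get(name) or "").strip()
            if value in MISSING_MARKERS:
                missing[name] += 1
    return rows, missing


if __name__ == "__main__":
    rows, missing = profile_csv(SAMPLE_CSV)
    print(f"rows: {rows}")
    for name, count in missing.items():
        print(f"{name}: {count} missing")
```

A column that is mostly missing, or a file with very few rows, is an early sign that the dataset may not support the question you have in mind. Findings like these also belong under Anticipated Problems in the proposal.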