Purpose

This document is required to indicate where various requirements can be found within your Final Project Report Rmd. You must indicate line numbers as they appear in your final Rmd document accompanying each of the following required tasks. Points will be deducted if line numbers are missing or differ significantly from the submitted Final Rmd document.

Final Project Requirements

Data Access

Description: (1) Analysis includes at least two different data sources. (2) Primary data source may NOT be loaded from an R package, though supporting data may. (3) Access to all data sources is contained within the analysis. (4) Imported data is inspected at the beginning of the analysis using one or more R functions, e.g., str, glimpse, head, tail, names, nrow, etc.

  1. .Rmd Line numbers where at least two different data sources are imported:

34-36

  2. .Rmd Line numbers for inspecting data intake:

48-77

Data Wrangling (5 out of 8 required)

Description: Students need not use every function and method introduced in STAT 184, but clear demonstration of proficiency should include proper use of 5 out of the following 8 topics from class: (+) various data verbs for general data wrangling like filter, mutate, summarise, arrange, group_by, etc. (+) joins for multiple data tables. (+) spread & gather to stack/unstack variables (+) regular expressions (+) reduction and/or transformation functions like mean, sum, max, min, n(), rank, pmin, etc. (+) user-defined functions (+) loops and control flow (+) machine learning

  1. .Rmd Line number(s) for general data wrangling:

99-126

  2. .Rmd Line number(s) for a join operation:

130-135

  3. .Rmd Line number(s) for a spread or gather operation (or equivalent):

169-205

  4. .Rmd Line number(s) for use of regular expressions:

148-165

  5. .Rmd Line number(s) for use of reduction and/or transformation functions:

139-143

  6. .Rmd Line number(s) for use of user-defined functions:

  7. .Rmd Line number(s) for use of loops and/or control flow:

  8. .Rmd Line number(s) for use of machine learning (not “wrangling” but scored here):

Data Visualization (3 of 5 required)

Description: Students need not use every function and method introduced in STAT 184, but clear demonstration of proficiency should include a range of useful data visualizations that (1) are relevant to the stated research question for the analysis, (2) include at least one effective display of many (at least 3) variables, and (3) include 3 of the following 5 visualization techniques learned in STAT 184: (+) use of multiple geoms such as points, density, lines, segments, boxplots, bar charts, histograms, etc. (+) use of multiple aesthetics (not necessarily all in the same graph) such as color, size, shape, x/y position, facets, etc. (+) layered graphics such as points and accompanying smoother, points and accompanying boxplots, overlaid density distributions, etc. (+) leaflet maps (+) decision tree and/or dendrogram displaying machine learning model results

  1. .Rmd Line number(s) for use of multiple different geoms:

212-263

  2. .Rmd Line number(s) for use of multiple aesthetics:

212-263

  3. .Rmd Line number(s) for use of layered graphics:

212-263

  4. .Rmd Line number(s) for use of leaflet maps:

  5. .Rmd Line number(s) for use of decision tree or dendrogram results:

Other requirements (Nothing for you to report in this Guidance Document)

  1. All data visualizations must be relevant to the stated research question, and the report must include at least one effective display of many (at least 3) variables.

  2. Code quality: Code formatting is consistent with the Style Guide Appendix of the DataComputing eBook. Specifically, all code chunks demonstrate proficiency with (1) meaningful object names, (2) proper use of white space, especially with respect to infix operators, chain operators, commas, brackets/parens, etc., (3) use of the <- assignment operator throughout, and (4) use of meaningful comments.

  3. Narrative quality: The narrative text (1) clearly states one research question that motivates the overall analysis, (2) explains reasoning for each significant step in the analysis and its relationship to the research question, (3) explains significant findings and conclusions as they relate to the research question, and (4) is completely free of errors in spelling and grammar.

  4. Overall Quality: (1) Submitted project shows significant effort to produce a high-quality and thoughtful analysis that showcases STAT 184 skills. (2) The project must be self-contained, such that the analysis can be entirely rerun without errors. (3) Analysis is coherent, well-organized, and free of extraneous content such as data dumps, unrelated graphs, and other content that is not overtly connected to the research question.

  5. EXTRA CREDIT: (1) Project is submitted as a self-contained GitHub Repo. (2) Project submission is a functioning github.io webpage generated for the project Repo. Note: a link to the GitHub Repo itself will be awarded partial credit, but does not itself qualify as a “webpage” of the analysis.
