name: title
class: center, middle

# Tidy data

L. Paloma Rojas Saunero

---
class: center, middle

![kondo](figs/mariakondo.jpg)

---
class: center, middle

![map](figs/map.png)

```r
library(tidyr)
```

---

## The rules of tidy data

.pull-left[
- **Rule 1**: Each **_variable_** must have its own **_column_**.
<br><br><br><br><br><br>
- **Rule 2**: Each **_observation_** must have its own **_row_**.
<br><br><br><br><br><br>
- **Rule 3**: Each **_value_** must have its own **_cell_**.
]

.pull-right[
<center>
![rule](figs/rules.png)
</center>
]

.foot-note[https://r4ds.had.co.nz/]

---

### Quiz 1

Which of these tables meets all three rules of tidy data?

.pull-left[
**Table A**

<table class="table table-hover table-condensed table-responsive" style="font-size: 14px; width: auto !important; margin-left: auto; margin-right: auto;">
<thead>
<tr> <th style="text-align:left;"> country </th> <th style="text-align:right;"> 1999 </th> <th style="text-align:right;"> 2000 </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> Afghanistan </td> <td style="text-align:right;"> 745 </td> <td style="text-align:right;"> 2666 </td> </tr>
<tr> <td style="text-align:left;"> Brazil </td> <td style="text-align:right;"> 37737 </td> <td style="text-align:right;"> 80488 </td> </tr>
<tr> <td style="text-align:left;"> China </td> <td style="text-align:right;"> 212258 </td> <td style="text-align:right;"> 213766 </td> </tr>
</tbody>
</table>

**Table B**

<table class="table table-hover table-condensed table-responsive" style="font-size: 14px; width: auto !important; margin-left: auto; margin-right: auto;">
<thead>
<tr> <th style="text-align:left;"> country </th> <th style="text-align:right;"> year </th> <th style="text-align:left;"> rate </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> Afghanistan </td> <td style="text-align:right;"> 1999 </td> <td style="text-align:left;"> 745/19987071 </td> </tr>
<tr> <td style="text-align:left;"> Afghanistan </td> <td style="text-align:right;"> 2000 
</td> <td style="text-align:left;"> 2666/20595360 </td> </tr>
<tr> <td style="text-align:left;"> Brazil </td> <td style="text-align:right;"> 1999 </td> <td style="text-align:left;"> 37737/172006362 </td> </tr>
<tr> <td style="text-align:left;"> Brazil </td> <td style="text-align:right;"> 2000 </td> <td style="text-align:left;"> 80488/174504898 </td> </tr>
<tr> <td style="text-align:left;"> China </td> <td style="text-align:right;"> 1999 </td> <td style="text-align:left;"> 212258/1272915272 </td> </tr>
<tr> <td style="text-align:left;"> China </td> <td style="text-align:right;"> 2000 </td> <td style="text-align:left;"> 213766/1280428583 </td> </tr>
</tbody>
</table>
]

.pull-right[
**Table C**

<table class="table table-hover table-condensed table-responsive" style="font-size: 14px; width: auto !important; margin-left: auto; margin-right: auto;">
<thead>
<tr> <th style="text-align:left;"> country </th> <th style="text-align:right;"> year </th> <th style="text-align:right;"> cases </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> Afghanistan </td> <td style="text-align:right;"> 1999 </td> <td style="text-align:right;"> 745 </td> </tr>
<tr> <td style="text-align:left;"> Afghanistan </td> <td style="text-align:right;"> 2000 </td> <td style="text-align:right;"> 2666 </td> </tr>
<tr> <td style="text-align:left;"> Brazil </td> <td style="text-align:right;"> 1999 </td> <td style="text-align:right;"> 37737 </td> </tr>
<tr> <td style="text-align:left;"> Brazil </td> <td style="text-align:right;"> 2000 </td> <td style="text-align:right;"> 80488 </td> </tr>
<tr> <td style="text-align:left;"> China </td> <td style="text-align:right;"> 1999 </td> <td style="text-align:right;"> 212258 </td> </tr>
<tr> <td style="text-align:left;"> China </td> <td style="text-align:right;"> 2000 </td> <td style="text-align:right;"> 213766 </td> </tr>
</tbody>
</table>
]

---

## Reshape: Wide to long _(cols to rows)_

![w2l](figs/w2l_full.png)

---

## `pivot_longer()`

.pull-left[
![w2l_1](figs/w2l_step1.png)
]

.pull-right[
```r
pivot_longer(
* cols = c(`1999`, `2000`),
* names_to = "year",
  values_to = "cases")
```
]

---

## `pivot_longer()`

.pull-left[
![w2l_2](figs/w2l_step2.png)
]

.pull-right[
```r
pivot_longer(
  cols = c(`1999`, `2000`),
  names_to = "year",
* values_to = "cases")
```
]

---

## Reshape: Long to wide _(rows to cols)_

![l2w](figs/l2w_full.png)

---

## `pivot_wider()`

.pull-left[
![l2w_1](figs/l2w_step1.png)
]

.pull-right[
```r
pivot_wider(
* names_from = year,
  values_from = cases)
```
]

---

## `pivot_wider()`

.pull-left[
![l2w_2](figs/l2w_step2.png)
]

.pull-right[
```r
pivot_wider(
  names_from = year,
* values_from = cases)
```
]

---

### Quiz 2

We need to reshape the tibble in Table 1 so that it looks like Table 2. Fill in the blanks, and correct the code if necessary:

```r
table %>%
  pivot_____(names_from = "____", values_from = "____")
```

.pull-left[
**Table 1**

<table class="table table-hover table-condensed table-responsive" style="font-size: 14px; width: auto !important; margin-left: auto; margin-right: auto;">
<thead>
<tr> <th style="text-align:right;"> student </th> <th style="text-align:left;"> food </th> <th style="text-align:right;"> rate </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> fruit </td> <td style="text-align:right;"> 3 </td> </tr>
<tr> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> vegetable </td> <td style="text-align:right;"> 5 </td> </tr>
<tr> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> icecream </td> <td style="text-align:right;"> 7 </td> </tr>
<tr> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> fruit </td> <td style="text-align:right;"> 4 </td> </tr>
<tr> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> vegetable </td> <td style="text-align:right;"> 8 </td> </tr>
<tr> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> icecream </td> <td style="text-align:right;"> 5 </td> </tr>
<tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> fruit </td> <td style="text-align:right;"> 1 </td> </tr>
<tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> vegetable </td> <td style="text-align:right;"> 4 </td> </tr>
<tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> icecream </td> <td style="text-align:right;"> 4 </td> </tr>
</tbody>
</table>
]

.pull-right[
**Table 2**

<table class="table table-hover table-condensed table-responsive" style="font-size: 14px; width: auto !important; margin-left: auto; margin-right: auto;">
<thead>
<tr> <th style="text-align:right;"> student </th> <th style="text-align:right;"> fruit </th> <th style="text-align:right;"> vegetable </th> <th style="text-align:right;"> icecream </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 7 </td> </tr>
<tr> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:right;"> 5 </td> </tr>
<tr> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 4 </td> </tr>
</tbody>
</table>
]
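
---

### Quiz 2: one possible answer

One way to check an answer is to rebuild Table 1 by hand and run the reshape. This is a minimal sketch, not the only valid solution: the object names `ratings`, `wide`, and `long` are hypothetical, and the data frame is typed in from the table on the previous slide.

```r
library(tidyr)

# Table 1 from the quiz, typed in by hand; `ratings` is a hypothetical name
ratings <- data.frame(
  student = rep(1:3, each = 3),
  food    = rep(c("fruit", "vegetable", "icecream"), times = 3),
  rate    = c(3, 5, 7, 4, 8, 5, 1, 4, 4)
)

# Long to wide: each food becomes a column, filled with its rating
wide <- ratings %>%
  pivot_wider(names_from = food, values_from = rate)

# Wide to long again, to confirm the reshape can be reversed
long <- wide %>%
  pivot_longer(cols = c(fruit, vegetable, icecream),
               names_to = "food", values_to = "rate")
```

Here `wide` should have one row per student, as in Table 2, and `long` should have the nine rows of Table 1 again.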