Questions

Question 1

Given the data below, create three columns, Sum_15, Sum_16 and Sum_17, where Sum_* is the sum of the monthly values for the corresponding year within each age group. The result should look like this:

AgeGroup	Sum_15	Sum_16	Sum_17
A	321	342	372
B	391	339	345
C	353	361	363
D	356	388	359
E	351	390	386

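library(tidyverse) # provides %>%, tibble(), mutate() and across()

set.seed(100) # make results reproducible by setting seed
# Column names: AgeGroup, then Jan_15 ... Dec_17 (12 months for each of 3 years)
vars <- c("AgeGroup", paste0(month.abb, "_", rep(15:17, each = 12)))

# Build 5 rows (A to E) of Poisson(30) counts, one column per month
(df <- cbind(LETTERS[1:5], matrix(rpois(n = (length(vars) - 1) * 5, 30), nrow = 5)) %>%
    data.frame() %>%
    setNames(vars) %>%
    tibble() %>%
    mutate(across(-1, as.integer))
)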

# A tibble: 5 × 37
  AgeGroup Jan_15 Feb_15 Mar_15 Apr_15 May_15 Jun_15 Jul_15 Aug_15 Sep_15 Oct_15
  <chr>     <int>  <int>  <int>  <int>  <int>  <int>  <int>  <int>  <int>  <int>
1 A            27     26     33     36     34     25     27     37     37     32
2 B            21     32     24     31     25     39     32     20     30     32
3 C            34     28     30     23     25     29     35     26     19     30
4 D            30     32     29     34     31     29     35     37     28     34
5 E            31     33     27     31     23     26     29     28     28     26
# ℹ 26 more variables: Nov_15 <int>, Dec_15 <int>, Jan_16 <int>, Feb_16 <int>,
#   Mar_16 <int>, Apr_16 <int>, May_16 <int>, Jun_16 <int>, Jul_16 <int>,
#   Aug_16 <int>, Sep_16 <int>, Oct_16 <int>, Nov_16 <int>, Dec_16 <int>,
#   Jan_17 <int>, Feb_17 <int>, Mar_17 <int>, Apr_17 <int>, May_17 <int>,
#   Jun_17 <int>, Jul_17 <int>, Aug_17 <int>, Sep_17 <int>, Oct_17 <int>,
#   Nov_17 <int>, Dec_17 <int>

Question 2

df1 <- structure(list(
  num_pp = c(1, 2, 3, 4, 5, 6),
  nombre_dp1 = c(24, 14, 2, 6, 6, 21),
  nombre_dp05 = c(20, 28, 2, 9, 8, 21),
  nombre_dp0 = c(24, 20, 4, 11, 8, 20),
  jugement_causal_dp1 = c("Oui", "Oui", "Oui", "Oui", "Oui", "Oui"),
  jugement_causal_dp05 = c("Non", "Oui", "Non", "Non", "Oui", "Non"),
  jugement_causal_dp0 = c("Non", "Non", "Oui", "Non", "Non", "Non"),
  confiance_dp1 = c(90, 80, 63, 80, 90, 80),
  confiance_dp05 = c(60, 50, 86, 65, 50, 90),
  confiance_dp0 = c(65, 60, 55, 43, 50, 80),
  age = c(33, 22, 20, 20, 18, 18),
  genre = c("Masculin", "Feminin", "Feminin", "Feminin", "Feminin", "Feminin"),
  etude = c("L1", "L1", "L1", "L1", "L1", "L1"),
  ordre = c("dp_05|dp_1|dp_0", "dp_0|dp_1|dp_05", "dp_0|dp_1|dp_05",
            "dp_0|dp_05|dp_1", "dp_1|dp_05|dp_0", "dp_1|dp_05|dp_0"),
  wdif_dp1dp05 = c(-4, 14, 0, 3, 2, 0)),
  row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))

The dataset above comes from a repeated-measures design, i.e. it has multiple measures, each recorded under 3 conditions.

For example, the measure nombre is repeated under the conditions dp1, dp05 and dp0. The dataset contains 3 such measures:

library(stringr) # for str_subset()
matrix(str_subset(names(df1), "_dp\\d+$"), 3, byrow = TRUE)

     [,1]                  [,2]                   [,3]                 
[1,] "nombre_dp1"          "nombre_dp05"          "nombre_dp0"         
[2,] "jugement_causal_dp1" "jugement_causal_dp05" "jugement_causal_dp0"
[3,] "confiance_dp1"       "confiance_dp05"       "confiance_dp0"

Transform the data so that each measure is a single column and the conditions are in a column of their own, i.e.:

participant id	condition	measure1	measure2	measure3
1	1
1	5
1	0

Question 3

For each of the tables below, write a regular expression pattern that matches all of the values in x and none of the values in y.
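One way to test a candidate pattern is sketched below. The helper check_pattern and the toy vectors x_toy and y_toy are hypothetical names introduced only for illustration, and perl = TRUE is an assumption made so that PCRE features such as lookaheads are available. The toy data is deliberately unrelated to the tables that follow.

```r
# Hypothetical helper (not part of the original exercise): returns TRUE only
# when `pattern` matches every string in `x` and no string in `y`.
check_pattern <- function(pattern, x, y) {
  all(grepl(pattern, x, perl = TRUE)) && !any(grepl(pattern, y, perl = TRUE))
}

# Toy vectors, deliberately unrelated to the exercise data
x_toy <- c("cat", "cot")
y_toy <- c("dog", "cut")

check_pattern("^c[ao]t$", x_toy, y_toy) # matches both x_toy values, neither y_toy value
check_pattern("^c", x_toy, y_toy)       # also matches "cut", so the check fails
```

The same call can then be made with the x and y columns of each table stored as character vectors.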

Ranges

x	y
abac	beam
accede	buoy
adead	canjac
babe	chymia
bead	corah
bebed	cupula
bedad	griece
bedded	hafter
bedead	idic
bedeaf	lucy
caba	martyr
caffa	matron
dace	messrs
dade	mucose
daff	relose
dead	sonly
deed	tegua
deface	threap
faded	towned
faff	widish
feed	yite

Back References

x	y
allochirally	anticker
anticovenanting	corundum
barbary	crabcatcher
calelectrical	damnably
entablement	foxtailed
ethanethiol	galvanotactic
froufrou	gummage
furfuryl	gurniad
galagala	hypergoddess
heavyheaded	kashga
linguatuline	nonimitative
mathematic	parsonage
monoammonium	pouchlike
perpera	presumptuously
photophonic	pylar
purpuraceous	rachioparalysis
salpingonasal	scherzando
testes	swayed
trisectrix	unbridledness
undergrounder	unupbraidingly
untaunted	wellside

Primes

x	y
xx	xxxx
xxx	xxxxxx
xxxxx	xxxxxxxx
xxxxxxx	xxxxxxxxx
xxxxxxxxxxx	xxxxxxxxxx
xxxxxxxxxxxxx	xxxxxxxxxxxx
xxxxxxxxxxxxxxxxx	xxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxx	xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxx	xxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxx	xxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx	xxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx	xxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx	xxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx	xxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx	xxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx	xxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx	xxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx	xxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx	xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx	xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Question 4

Transform the dataset below so that it has two columns, Name and Year:

Name
Percy Vere (2020)
Ginger Plant (2017)
Perry (2019)
Pat Thettick (2020)
Samuel (2022)
Fay Daway (2008)
Greg (2022)
Simon Sais (2011)