A Function That Checks for Complete Cases in Several Files Fails When Specifying a File Range

I have created a function that checks a series of CSV files for complete cases and builds a table of the counts. The function takes two arguments: the first is the containing folder, and the second is the range of files to check (the files are numbered from 1 to 332). If I run the function on the whole range, 1:332, it works:

> complete <- function(pathway, id=1:332){
+   files_list <- list.files(path = pathway, pattern = ".csv", full.names = TRUE)
+   final_result <- data.frame(id = id, nobs = as.numeric(NA))
+   for(i in id){
+     storefiles <- read.csv(files_list[i])
+     final_result$nobs[i] <- sum(complete.cases(storefiles))
+   }
+   return(final_result)
+ }
> complete("specdata")
     id nobs
1     1  117
2     2 1041
3     3  243
4     4  474
5     5  402
6     6  228
7     7  442
8     8  192
9     9  275
10   10  148
11   11  443
12   12   96
13   13   46
14   14   96
15   15   83
16   16   60
17   17  927
18   18   84
19   19  353
20   20  124
21   21  426
22   22  135
23   23  492
24   24  885
25   25  463
26   26  586
27   27  338
28   28  475
29   29  711
30   30  932
31   31  483
32   32  616
33   33  466
34   34  165
35   35  509
36   36  495
37   37  497
38   38  491
39   39  734
40   40   21
41   41  227
42   42   60
43   43   74
44   44  283
45   45  424
46   46   89
47   47  540
48   48   62
49   49  473
50   50  459
51   51  193
52   52  812
53   53  342
54   54  219
55   55  372
56   56  642
57   57  452
58   58  391
59   59  445
60   60  448
61   61  155
62   62  414
63   63  403
64   64  932
65   65   66
66   66  374
67   67  436
68   68  418
69   69   15
70   70  124
71   71  360
72   72  406
73   73   60
74   74  462
75   75  779
76   76  385
77   77  345
78   78  275
79   79  132
80   80  302
81   81  100
82   82   96
83   83  280
84   84  169
85   85  120
86   86  462
87   87  374
88   88  312
89   89  858
90   90   56
91   91  366
92   92   49
93   93  393
94   94  309
95   95   81
96   96  873
97   97  243
98   98  736
99   99  479
100 100  104
101 101   64
102 102  145
103 103  439
104 104  385
105 105  237
106 106   74
107 107    7
108 108  454
109 109  223
110 110  234
111 111  329
112 112  242
113 113  348
114 114  753
115 115  177
116 116  806
117 117  285
118 118   12
119 119  150
120 120  687
121 121  828
122 122  178
123 123  349
124 124  277
125 125  422
126 126  108
127 127  428
128 128  390
129 129   53
130 130  101
131 131  201
132 132  260
133 133  978
134 134  100
135 135   41
136 136  397
137 137    2
138 138  847
139 139  241
140 140  407
141 141  183
142 142  209
143 143  194
144 144  682
145 145  226
146 146   53
147 147  302
148 148  822
149 149  164
150 150  685
151 151  233
152 152  650
153 153  177
154 154 1095
155 155   79
156 156  298
157 157   89
158 158  671
159 159   72
160 160  646
161 161  145
162 162    2
163 163   73
164 164  151
165 165  761
166 166  207
167 167  558
168 168  523
169 169   17
170 170  741
171 171  614
172 172  776
173 173  234
174 174  278
175 175    3
176 176  621
177 177  971
178 178  422
179 179  551
180 180  576
181 181  286
182 182  465
183 183  332
184 184  816
185 185  923
186 186  432
187 187    4
188 188  431
189 189  174
190 190  342
191 191  463
192 192  154
193 193  317
194 194  928
195 195  304
196 196  753
197 197  334
198 198  858
199 199  217
200 200  460
201 201  409
202 202  661
203 203  431
204 204  125
205 205  361
206 206   73
207 207   51
208 208  121
209 209  151
210 210  311
211 211  121
212 212   51
213 213  345
214 214  431
215 215  469
216 216  251
217 217  181
218 218  216
219 219  182
220 220  407
221 221  199
222 222  445
223 223  899
224 224   16
225 225  332
226 226   16
227 227  278
228 228  548
229 229  431
230 230  180
231 231  470
232 232  886
233 233  137
234 234  253
235 235   12
236 236   52
237 237  146
238 238   74
239 239  410
240 240  407
241 241  439
242 242  344
243 243  398
244 244  440
245 245  432
246 246  428
247 247  191
248 248 1005
249 249  230
250 250  180
251 251   27
252 252  509
253 253  531
254 254  437
255 255  657
256 256   96
257 257  886
258 258  444
259 259   76
260 260  386
261 261   50
262 262  245
263 263  357
264 264   44
265 265  438
266 266  439
267 267  403
268 268  424
269 269  191
270 270  411
271 271  499
272 272  253
273 273  203
274 274    4
275 275    0
276 276    0
277 277  908
278 278    0
279 279  822
280 280   15
281 281   81
282 282   92
283 283   90
284 284   87
285 285   83
286 286    0
287 287  812
288 288   40
289 289    0
290 290   91
291 291    0
292 292    0
293 293    0
294 294    0
295 295   75
296 296   14
297 297   10
298 298   66
299 299  331
300 300  927
301 301  438
302 302  937
303 303  585
304 304  135
305 305  263
306 306  203
307 307  174
308 308   79
309 309  213
310 310  232
311 311   65
312 312  216
313 313  368
314 314  888
315 315  183
316 316   77
317 317   47
318 318  200
319 319  113
320 320  627
321 321  353
322 322  301
323 323   34
324 324   34
325 325  817
326 326  215
327 327  162
328 328  967
329 329  439
330 330  447
331 331  284
332 332   16

But if I want to check a range that does not start at file number 1 (e.g. 2:22), it fails. It only works when the first number is 1:

> complete("specdata", 8:244)
 Error in `$<-.data.frame`(`*tmp*`, "nobs", value = c(NA, NA, NA, NA, NA, : 
replacement has 238 rows, data has 237
4.
stop(sprintf(ngettext(N, "replacement has %d row, data has %d", 
"replacement has %d rows, data has %d"), N, nrows), domain = NA)
3.
`$<-.data.frame`(`*tmp*`, "nobs", value = c(NA, NA, NA, NA, NA, 
NA, NA, 192, 275, 148, 443, 96, 46, 96, 83, 60, 927, 84, 353, 
124, 426, 135, 492, 885, 463, 586, 338, 475, 711, 932, 483, 616, 
466, 165, 509, 495, 497, 491, 734, 21, 227, 60, 74, 283, 424, ...
2.
`$<-`(`*tmp*`, "nobs", value = c(NA, NA, NA, NA, NA, NA, NA, 
192, 275, 148, 443, 96, 46, 96, 83, 60, 927, 84, 353, 124, 426, 
135, 492, 885, 463, 586, 338, 475, 711, 932, 483, 616, 466, 165, 
509, 495, 497, 491, 734, 21, 227, 60, 74, 283, 424, 89, 540, ...
1.
complete("specdata", 8:244)


Solution 1:[1]

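You're getting this error because your "final_result" data frame has one row per requested id, numbered from 1, but the loop writes to row i, which is the file number itself. With id = 8:244 the data frame has 237 rows, so rows 1 to 7 stay NA (as the traceback shows) and the assignment fails as soon as i reaches 238.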
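
The mismatch can be reproduced without reading any files. The following is a minimal sketch, using a made-up id range of 8:12 and a made-up count of 100, that shows the row positions and the error raised when writing to a row number taken from the id value:

```r
# Made-up id range for illustration; the real function uses file numbers
id <- 8:12
final_result <- data.frame(id = id, nobs = NA_real_)

n_rows <- nrow(final_result)   # 5 rows, at positions 1..5
positions <- seq_along(id)     # 1 2 3 4 5: the rows that actually exist

# Writing to row 8 (the id value) extends nobs to length 8,
# which no longer fits the 5-row data frame
msg <- tryCatch({
  final_result$nobs[8] <- 100
  "no error"
}, error = function(e) conditionMessage(e))

print(msg)   # "replacement has 8 rows, data has 5"
```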

Try:

complete <- function(pathway, id=1:332){
  files_list <- list.files(path = pathway, pattern = ".csv", full.names = TRUE)
  final_result <- data.frame(id = id, nobs = as.numeric(NA))
  for(i in id){
    # j is the row of final_result that belongs to file number i
    j <- which(id == i)
    # files_list is still indexed by file number, so read file i (not j)
    storefiles <- read.csv(files_list[i])
    final_result$nobs[j] <- sum(complete.cases(storefiles))
  }
  return(final_result)
}
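
A variation on the same idea is to loop over row positions with seq_along(id) and look up the file number as id[j]. The sketch below is one possible way to write it, not the original author's code. The name complete_sketch, the demo directory, and the three tiny CSV files are made up so the example runs on its own, and it assumes that list.files() returns the files in file-number order (for instance because the names are zero-padded, such as 001.csv).

```r
complete_sketch <- function(pathway, id = 1:332) {
  # "\\.csv$" matches only file names that end in ".csv"
  files_list <- list.files(path = pathway, pattern = "\\.csv$", full.names = TRUE)
  final_result <- data.frame(id = id, nobs = NA_real_)
  for (j in seq_along(id)) {
    # id[j] picks the file, j picks the row of the result
    storefiles <- read.csv(files_list[id[j]])
    final_result$nobs[j] <- sum(complete.cases(storefiles))
  }
  final_result
}

# Hypothetical demo data: three small files with 1, 2 and 1 complete rows
demo_dir <- file.path(tempdir(), "specdata_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(x = c(1, NA), y = c(1, 2)),
          file.path(demo_dir, "001.csv"), row.names = FALSE)
write.csv(data.frame(x = c(1, 2), y = c(1, 2)),
          file.path(demo_dir, "002.csv"), row.names = FALSE)
write.csv(data.frame(x = c(NA, NA, 3), y = c(1, 2, 3)),
          file.path(demo_dir, "003.csv"), row.names = FALSE)

# A range that does not start at 1
result <- complete_sketch(demo_dir, 2:3)
print(result)
```

Here the result has two rows, one for file 2 and one for file 3, regardless of where the range starts.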

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Emmanuel