'Portfolio Returns in Panel Data with R
I have a Panel Data set of different portfolios with prices and I want to track the performance of each over a period of 10 years. I already calculated the returns with:
returnfun <- function(x) c(NA, diff(x) / head(x, -1))
Portfolio10pd=returns=transform(Portfolio10pd, Returns = ave(prccm, GVKEY, FUN = returnfun))
When the GVKEY changes it produces 1 "Na" because there is no old price as reference. So I cleaned the NA return columns with
P10=subset(NaRV.omit(Portfolio10pd))
head(P1, 10)
GVKEY datadate prccm Returns
2 1 20100228 292.00 -0.0009921653
3 1 20100331 299.13 0.0244178082
4 1 20100430 317.82 0.0624811955
5 1 20100531 324.37 0.0206091498
6 1 20100630 302.92 -0.0661281869
7 1 20100731 334.17 0.1031625512
8 1 20100831 304.55 -0.0886375198
9 1 20100930 315.78 0.0368740765
10 1 20101031 375.84 0.1901957059
> tail(P1,10)
GVKEY datadate prccm Returns
20322 144 20200331 32.74 -0.238781679
20323 144 20200430 32.62 -0.003665241
20324 144 20200531 30.38 -0.068669528
20325 144 20200630 32.27 0.062211982
20326 144 20200731 30.71 -0.048342113
20327 144 20200831 28.02 -0.087593618
20328 144 20200930 26.58 -0.051391863
20329 144 20201031 24.75 -0.068848758
20330 144 20201130 31.20 0.260606061
20331 144 20201231 31.68 0.015384615
I have 144 Firms with monthly returns from February 2010 until December 2020. I would like to calculate the monthly returns of the equally weighted Portfolio to track the performance over time. I cannot use Return.portfolio() function of the performance analytics package because the returns are not in the right format. Has somebody a hint for me or a solution for that problem? I am new to R, so thank u very much!
Solution 1:[1]
Does it really have to be that format? It is typically more efficient to merge the series on the timestamps and then compute the returns of a strategy. An example, for which I first create a small data set.
library("PMwR")
library("zoo")
data <- read.table(header = TRUE, sep = ",", text =
"GVKEY, datadate, prccm
1, 20100228, 292.00
1, 20100331, 299.13
1, 20100430, 317.82
1, 20100531, 324.37
1, 20100630, 302.92
2, 20100228, 28.02
2, 20100331, 26.58
2, 20100430, 24.75
2, 20100531, 31.20
2, 20100630, 31.68")
z <- by(data, data$GVKEY, function(x) zoo(x$prccm, x$datadate))
prices <- do.call(merge, z)
## 1 2
## 20100228 292.00 28.02
## 20100331 299.13 26.58
## 20100430 317.82 24.75
## 20100531 324.37 31.20
## 20100630 302.92 31.68
returns(prices,
weights = c(0.5, 0.5),
rebalance.when = 20100228)
## 20100331 20100430 20100531 20100630
## -0.0134870274 -0.0006607329 0.1281236339 -0.0253235197
Solution 2:[2]
Good Morning,
thanks for your help. I started formating everything new:
dput(x)
structure(list(GVKEY = c("002222", "002237", "002402", "002990",
"003691", "004130", "005336", "005902", "006130", "006185", "007183",
"007192", "007389", "007694", "009330", "010476", "011751", "012252",
"012711", "012713", "012717", "012731", "012744", "012942", "013472",
So I have a vector of gvkeys and a matrix of gvkeys and monthly dates of 10 years and monthly prices.
> head(Preise)
# A tibble: 6 x 3
GVKEY datadate prccm
<chr> <dbl> <dbl>
1 001003 20100131 0.02
2 001003 20100228 0.02
3 001003 20100331 0.02
4 001003 20100430 0.02
5 001003 20100531 0.02
6 001003 20100630 0.02
It looks like this. When I merged this it looks like the matrix I posted in my solutions. But I want to merge it and bring this to the format you suggest. What I need is a function which looks for the gvkeys of my list and compares them to the gvkeys in the matrix and merges them. So I have the wanted gvkeys with the monthly prices and I would like your the gvkeys as row names and the different dates as column names. So I can work with perfomance analytics to measure the performance.
Thank you very much!
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Enrico Schumann |
| Solution 2 | Surfer98 |
