Hedge Fund Data

Hedge funds, private investment vehicles limited to professional or accredited investors under regulations like the U.S. Securities Act of 1933, offer unique opportunities in the financial markets. Hedge fund data is made accessible through several major data vendors, including BarclayHedge, Eurekahedge, Morningstar, HFR, HFM and others. Estimating the total number of hedge funds remains challenging because counting methods conflict: some count fund subclasses separately, and some include Commodity Trading Advisors (CTAs) and funds of hedge funds while others exclude them. The estimated total number of hedge funds ranges from 15,000 to 17,000 when subclasses are excluded, and expands to approximately 56,000 when subclasses are included.

Hedge fund data comprises three main datasets: descriptive or qualitative information, pricing and assets under management (AUM) data, and holdings information.

Hedge Fund Qualitative Information

This dataset contains up to 4,000 data points, depending on the data vendor. The main mandatory fields typically include the following:

Fund Name | Fund Other Fee | Fund Prime Broker
Fund Domicile | Fund Highwater Mark | Fund Prime Broker Contact
Fund Type | Fund Hurdle Rate | Fund Prime Broker Phone
Fund Status | Fund Leverage | Fund Auditor
Fund AUM USD | Fund Subscription Frequency | Manager Name
Fund Currency | Fund Lockup Period | Manager Address1
Fund AUM Reported Currency | Fund Redemption Frequency | Manager City
Fund Inception Date | Fund Redemption Notice Period Days | Manager State Province
Fund Reports Daily Data | Fund Redemption Payout | Manager Zip Code
Fund 13F Filing Available | Fund Strategy | Manager Country
Fund Investment Minimum | Fund Advisor | Fund SEC Registered
Fund Reporting Style | Fund Advisor Contact | Fund Uses Equalization
Fund Management Fee | Fund Advisor Contact Phone | Fund Equalization Method
Fund Performance Fee | Fund Custodian | Fund Targeted Return
Fund Redemption Fee | Fund Custodian Contact | Fund Targeted Volatility

Hedge Fund Performance Data

This dataset usually includes the following data points:

  • Fund RoR (rate of return) for the period, usually a month.
  • Fund NAV or unit price.
  • Fund AUM.
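
The first two data points are closely related: when a vendor supplies only NAV or unit prices, a periodic RoR can be derived from consecutive values. The following is a minimal sketch of that arithmetic, assuming month-end NAVs per unit with no distributions between observations. The function names and the NAV figures are illustrative and do not come from any vendor's specification.

```python
def returns_from_nav(navs):
    """Simple period-over-period rate of return from a NAV series."""
    if len(navs) < 2:
        return []
    return [curr / prev - 1.0 for prev, curr in zip(navs, navs[1:])]


def compound(returns):
    """Cumulative return obtained by compounding periodic returns."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


# Hypothetical month-end NAVs per unit (illustrative numbers only).
navs = [100.0, 102.0, 99.96, 104.958]

monthly_ror = returns_from_nav(navs)   # roughly +2%, -2%, +5%
cumulative = compound(monthly_ror)     # roughly +4.96% over three months
```

Compounding the monthly figures reproduces the return implied by the first and last NAV, which is a convenient consistency check when a vendor delivers both RoR and NAV for the same fund.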

Hedge Fund Holdings Data

Holdings data is an optional dataset provided by only a select few vendors. The challenge lies in the confidentiality of holdings information, which fund managers typically do not disclose publicly. As a result, the majority of hedge funds do not offer holdings data. However, hedge fund investors may still access this information under a non-disclosure agreement (NDA). Another way to evaluate hedge fund risks is holdings-based analysis, which only a few analytics providers offer. In that case, the provider collects holdings data from hedge funds and runs the risk analysis on behalf of end clients. Although end clients do not have direct access to the holdings, they can still assess risk measures calculated from the holdings rather than from the fund’s returns.

Hedge Fund Data Biases

Biases in fund databases pose a significant challenge for several reasons. Firstly, these databases typically cover only a portion of the entire hedge fund population, ranging from 30 percent to 50 percent. Secondly, biases can arise from the specific methodologies used to count hedge funds and their corresponding returns.

Survivorship bias. Survivorship bias refers to the tendency of hedge fund databases to include only funds that have survived over a certain period, excluding funds that have closed or performed poorly. This bias occurs because databases often do not account for funds that have ceased operations or liquidated due to underperformance. As a result, average performance computed from the surviving funds overstates the performance of the full fund universe, leading to an overestimation of hedge fund returns and potentially distorting investment decisions.
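
The size of the effect can be illustrated by comparing the average return of surviving funds with the average over all funds, including those that closed. The sketch below uses a made-up five-fund universe; the fund names, returns, and the `alive` flag are hypothetical and serve only to show the calculation.

```python
from statistics import mean

# Hypothetical annual returns; "alive" marks funds still operating.
funds = [
    {"name": "A", "annual_return": 0.12, "alive": True},
    {"name": "B", "annual_return": 0.08, "alive": True},
    {"name": "C", "annual_return": -0.15, "alive": False},
    {"name": "D", "annual_return": 0.05, "alive": True},
    {"name": "E", "annual_return": -0.30, "alive": False},
]


def average_return(records):
    """Equal-weighted average of annual returns."""
    return mean(r["annual_return"] for r in records)


# What a survivors-only database would report.
survivors_only = average_return([f for f in funds if f["alive"]])

# What an investor choosing among all five funds would have experienced.
full_universe = average_return(funds)

# Overstatement caused by dropping the closed funds.
survivorship_bias = survivors_only - full_universe
```

In this toy universe the survivors average roughly +8.3 percent while the full universe averages -4.0 percent, so a database that dropped the two closed funds would overstate the average return by about 12 percentage points.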

Selection bias. Selection bias refers to the distortion in performance data caused by the non-random selection of funds included in databases. This bias arises because hedge fund databases often include only funds that voluntarily report their performance or meet certain criteria, such as size or longevity. Funds that choose not to report or do not meet the database’s criteria are excluded, leading to a skewed representation of hedge fund performance. As a result, the performance data may not accurately reflect the broader universe of hedge funds, potentially influencing investment decisions based on incomplete or biased information.

Back reporting bias. Back reporting bias refers to the practice of retroactively adding successful funds to a database after their performance has been realized. Because funds with favorable performance histories are the ones most likely to be added this way, while less successful funds are left out, the overall performance data is distorted upward. As a result, investors may be misled into believing that the historical performance shown in the database is more favorable than it actually was, potentially leading to inaccurate investment decisions.

Instant history bias. Instant history bias refers to the phenomenon where newly added funds enter a database with their historical performance data as soon as they begin reporting. This bias arises because databases may retroactively fill in a fund’s actual performance since inception, including the period before the fund began reporting. Because funds tend to start reporting only after a stretch of good results, this backfilled history is not representative of what a database user would have observed in real time. Instant history bias can distort the perception of a fund’s track record, potentially leading investors to make investment decisions based on incomplete or inaccurate information about a fund’s past performance.
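
A common way to reduce back reporting and instant history effects is to discard returns that predate the fund's addition to the database. The sketch below assumes the vendor supplies the date on which the fund was first listed; that field, the function name `drop_backfilled`, and the sample returns are assumptions made for illustration.

```python
from datetime import date


def drop_backfilled(observations, date_added):
    """Keep only (period_end, return) pairs dated on or after date_added."""
    return [(d, r) for d, r in observations if d >= date_added]


# Hypothetical monthly returns for one fund (illustrative numbers only).
observations = [
    (date(2020, 1, 31), 0.04),
    (date(2020, 2, 29), 0.03),
    (date(2020, 3, 31), 0.05),
    (date(2020, 4, 30), -0.01),
    (date(2020, 5, 31), 0.01),
]

# Assumed date on which the vendor first listed the fund.
date_added = date(2020, 4, 1)

live_only = drop_backfilled(observations, date_added)

avg_all = sum(r for _, r in observations) / len(observations)
avg_live = sum(r for _, r in live_only) / len(live_only)
```

Here the three backfilled months lift the average monthly return to about 2.4 percent, while the two months reported after listing average zero. The cost of this filter is an even shorter return series, so it trades one data problem for another.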

Short Data Series

Limited data series are a common challenge in hedge fund analysis, typically consisting of around twenty to sixty monthly observations. This stands in contrast to stocks and bonds, which often have extensive return series. Short data series necessitate the use of analytical methods that can offer sufficient confidence despite limited data. For instance, applying the Conditional Value-at-Risk (CVaR) to hedge funds becomes problematic because the estimate relies on a small number of historical observations: with sixty monthly returns, a 95 percent CVaR rests on only the three worst months. This raises concerns about its accuracy and reliability.
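
The fragility of a tail measure on a short series can be made concrete with a historical (empirical) CVaR calculation. The sketch below is one simple estimator among several, shown under stated assumptions: CVaR is taken as the average of the worst (1 - confidence) share of observations, the tail size is rounded down with a minimum of one observation, and the sixty monthly returns are made up for illustration.

```python
import math


def historical_cvar(returns, confidence=0.95):
    """Average loss over the worst (1 - confidence) share of observations.

    Returns (cvar, n_tail); cvar is expressed as a positive loss.
    """
    if not returns:
        raise ValueError("returns must not be empty")
    # Small epsilon guards against floating-point round-down (e.g. 5.9999...).
    n_tail = max(1, math.floor(len(returns) * (1.0 - confidence) + 1e-9))
    worst = sorted(returns)[:n_tail]
    return -sum(worst) / n_tail, n_tail


# Hypothetical 60-month series: 57 months of +1% and three losing months.
base = [0.01] * 57 + [-0.08, -0.05, -0.02]
cvar_base, n_tail = historical_cvar(base)

# Same series with the single worst month replaced by a milder loss.
alt = [0.01] * 57 + [-0.02, -0.05, -0.02]
cvar_alt, _ = historical_cvar(alt)
```

With sixty observations the 95 percent tail contains three months. Changing just one of them moves the estimate from about 5 percent to about 3 percent, which shows how strongly the result depends on a handful of data points.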

In conclusion, hedge fund data presents a distinct set of challenges due to the unique nature of these assets. Controversy often surrounds hedge fund data, stemming from issues such as survivorship bias, selection bias, and limited data series. However, it is crucial for hedge fund investors to access clean and unbiased data to make informed investment decisions. By addressing these challenges and ensuring the integrity of the data, investors can better navigate the complexities of hedge fund investing and enhance their chances of success in the market.