<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Posts | Yaohan Chen</title>
    <link>/post/</link>
      <atom:link href="/post/index.xml" rel="self" type="application/rss+xml" />
    <description>Posts</description>
    <generator>Source Themes Academic (https://sourcethemes.com/academic/)</generator><language>en-us</language><lastBuildDate>Sun, 01 Mar 2026 00:00:00 +0000</lastBuildDate>
    <image>
      <url>/images/icon_hu74a6678d1d6008a500a19274577927da_17459_512x512_fill_lanczos_center_3.png</url>
      <title>Posts</title>
      <link>/post/</link>
    </image>
    
    <item>
      <title>Fast Introduction to Claude and GLM</title>
      <link>/post/claude_glm/</link>
      <pubDate>Sun, 01 Mar 2026 00:00:00 +0000</pubDate>
      <guid>/post/claude_glm/</guid>
      <description>&lt;p&gt;
&lt;a href=&#34;./claude_code_teaching_note_en.html&#34;&gt;Claude and GLM&lt;/a&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>MCMC, Spot Volatility, and Strategic Value of Information</title>
      <link>/post/mcmc_spot_volatility/</link>
      <pubDate>Tue, 01 Mar 2022 00:00:00 +0000</pubDate>
      <guid>/post/mcmc_spot_volatility/</guid>
      <description>&lt;h2 id=&#34;spot-volatility&#34;&gt;Spot Volatility&lt;/h2&gt;
&lt;p&gt;Traditionally, obtaining a reasonably accurate estimate of spot volatility with nonparametric econometric tools relies on an asymptotic scheme. In high-frequency trading this scheme is hard to satisfy in practice, because collecting and sampling data at high frequency while keeping the number of observations reasonably large raises many technical issues. This is the major motivation for applying MCMC to smooth the noise in the nonparametric estimate of spot volatility while fixing the local estimation window size 
&lt;a href=&#34;https://qeconomics.org/ojs/index.php/qe/article/view/1595&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;(Bollerslev, Li and Liao, 2021)&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;volatility-and-strategic-value-of-information&#34;&gt;Volatility and Strategic Value of Information&lt;/h2&gt;
&lt;p&gt;Kyle&amp;rsquo;s lambda, the measure bridging volatility and the value of private information, was initially proposed in 
&lt;a href=&#34;https://www.jstor.org/stable/1913210&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Kyle (1985)&lt;/a&gt;, later extended to a continuous-time setting in 
&lt;a href=&#34;https://academic.oup.com/rfs/article-abstract/5/3/387/1576252&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Back (1992)&lt;/a&gt; and more recently discussed in 
&lt;a href=&#34;https://oxford.universitypressscholarship.com/view/10.1093/acprof:oso/9780190241148.001.0001/acprof-9780190241148&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Back (2017)&lt;/a&gt;. For more technical details and comprehensive discussions, please refer to the literature listed.&lt;/p&gt;
&lt;h2 id=&#34;idea-is-simple-regressionbayesian-&#34;&gt;Idea is simple: Regression+Bayesian !!!&lt;/h2&gt;
&lt;p&gt;Kyle&amp;rsquo;s lambda can be backed out as the slope of the following regression:
$$
r_{t i}=\lambda_{t} y_{ti}+\varepsilon_{ti}
$$
where $r_{ti}$ is the log return of the asset over time interval $i$ on date $t$:
$$
r_{t i}=p_{t \tau_{i}}-p_{t \tau_{i-1}}=\ln P_{t \tau_{i}}-\ln P_{t \tau_{i-1}},
$$
and $y_{ti}=Y_{t\tau_{i}}-Y_{t\tau_{i-1}}$ is the order flow over interval $i$ on date $t$.
In addition, it can be shown that the strategic value of information can be characterized as
$$
\Omega=\mathbb{E}[J(0, \bar{v}, \tilde{v})]=\frac{\sigma_{v}^{2}}{\lambda} P_{0}
$$
where $J(t,p,v)$ is the value function from the HJB equation characterizing the strategy that dynamically maximizes accumulated wealth:
$$
J(t, p, v)=\frac{p-v+v(\ln v-\ln p)}{\lambda}+\frac{1}{2} \sigma_{v} \sigma_{z}(1-t) v.
$$
Consequently, we can estimate the strategic value of information as follows
$$
\hat{\Omega}_{t}=\frac{\hat{\sigma}_{t}^{2}}{\hat{\lambda}_{t}} P_{t-1}
$$
where $\hat{\lambda}_{t}$ is the slope from the univariate regression above (Regression) and $\hat{\sigma}_{t}^{2}$ is the volatility smoothed using MCMC (Bayesian).&lt;/p&gt;
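&lt;p&gt;The regression step and the plug-in estimate above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function names &lt;code&gt;kyle_lambda&lt;/code&gt; and &lt;code&gt;information_value&lt;/code&gt; and all numbers are hypothetical, and the smoothed variance is passed in as a given number rather than produced by an actual MCMC run.&lt;/p&gt;

```python
def kyle_lambda(returns, order_flow):
    """Slope of the no-intercept regression r_ti = lambda_t * y_ti + eps_ti."""
    if len(returns) != len(order_flow):
        raise ValueError('returns and order_flow must have the same length')
    sum_yy = sum(y * y for y in order_flow)
    if sum_yy == 0:
        raise ValueError('order flow has no variation')
    sum_yr = sum(y * r for y, r in zip(order_flow, returns))
    # Least-squares slope through the origin: sum(y * r) / sum(y * y)
    return sum_yr / sum_yy


def information_value(sigma2_hat, lambda_hat, prev_price):
    """Plug-in estimate Omega_t = sigma_t^2 / lambda_t * P_{t-1}."""
    return sigma2_hat / lambda_hat * prev_price


# Toy intraday data for one date t (made-up numbers, not real market data)
order_flow = [1.0, -2.0, 3.0, -1.0]
returns = [0.5 * y for y in order_flow]  # noise-free, so the true slope is 0.5

lambda_hat = kyle_lambda(returns, order_flow)
# sigma2_hat stands in for the MCMC-smoothed variance (assumed value)
omega_hat = information_value(sigma2_hat=0.04, lambda_hat=lambda_hat, prev_price=100.0)
print(lambda_hat, omega_hat)
```

&lt;p&gt;The regression has no intercept because the model equation above has none. With noisy returns the same function gives the least-squares slope for that date.&lt;/p&gt;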
&lt;h2 id=&#34;is-simple-regression-valid--almost-surely-&#34;&gt;Is Simple Regression Valid ? Almost Surely !!!&lt;/h2&gt;





  
  











&lt;figure id=&#34;figure-daily-kyles-lambda-in-november-2018&#34;&gt;


  &lt;a data-fancybox=&#34;&#34; href=&#34;/post/mcmc_spot_volatility/kyles_lambda_reg_2018_huf5dee2dda54b7af16247d30bbc5ee86b_175026_2000x2000_fit_lanczos_3.png&#34; data-caption=&#34;Daily Kyle&amp;amp;rsquo;s Lambda in November 2018&#34;&gt;


  &lt;img data-src=&#34;/post/mcmc_spot_volatility/kyles_lambda_reg_2018_huf5dee2dda54b7af16247d30bbc5ee86b_175026_2000x2000_fit_lanczos_3.png&#34; class=&#34;lazyload&#34; alt=&#34;&#34; width=&#34;1040&#34; height=&#34;749&#34;&gt;
&lt;/a&gt;


  
  
  &lt;figcaption data-pre=&#34;Figure &#34; data-post=&#34;:&#34; class=&#34;numbered&#34;&gt;
    Daily Kyle&amp;rsquo;s Lambda in November 2018
  &lt;/figcaption&gt;


&lt;/figure&gt;






  
  











&lt;figure id=&#34;figure-daily-kyles-lambda-from-2004-to-2020&#34;&gt;


  &lt;a data-fancybox=&#34;&#34; href=&#34;/post/mcmc_spot_volatility/all_Kyles_lambda_hu79904fb68867ee38dc92442b2404b310_66676_2000x2000_fit_lanczos_3.png&#34; data-caption=&#34;Daily Kyle&amp;amp;rsquo;s Lambda from 2004 to 2020&#34;&gt;


  &lt;img data-src=&#34;/post/mcmc_spot_volatility/all_Kyles_lambda_hu79904fb68867ee38dc92442b2404b310_66676_2000x2000_fit_lanczos_3.png&#34; class=&#34;lazyload&#34; alt=&#34;&#34; width=&#34;952&#34; height=&#34;756&#34;&gt;
&lt;/a&gt;


  
  
  &lt;figcaption data-pre=&#34;Figure &#34; data-post=&#34;:&#34; class=&#34;numbered&#34;&gt;
    Daily Kyle&amp;rsquo;s Lambda from 2004 to 2020
  &lt;/figcaption&gt;


&lt;/figure&gt;

</description>
    </item>
    
    <item>
      <title>The anomalies (characteristics) universe our research has to live in so far</title>
      <link>/post/whichuniverse/</link>
      <pubDate>Sat, 19 Feb 2022 00:00:00 +0000</pubDate>
      <guid>/post/whichuniverse/</guid>
      <description>&lt;h2 id=&#34;anomalies-characteristics-available&#34;&gt;Anomalies (Characteristics available)&lt;/h2&gt;
&lt;p&gt;Several standard datasets and code bases are currently available for constructing documented firm-level characteristics and anomalies, including 
&lt;a href=&#34;https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2262374&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Green et al. (2017)&lt;/a&gt;, 
&lt;a href=&#34;https://dachxiu.chicagobooth.edu/download/ML.pdf&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Gu et al. (2019)&lt;/a&gt;, 
&lt;a href=&#34;https://sites.google.com/site/serhiykozak/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Kozak et al. (2020)&lt;/a&gt; and 
&lt;a href=&#34;https://sites.google.com/site/chenandrewy/open-source-ap?authuser=0&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Chen and Zimmermann (2020)&lt;/a&gt;. In my limited research experience, the work by 
&lt;a href=&#34;https://www.openassetpricing.com/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Chen and Zimmermann (2021)&lt;/a&gt; is by far the most comprehensive, providing detailed information about firm-level characteristics and the associated portfolios constructed by sorting on these characteristics. I came across this excellent project while working on one of my recent research projects. For a detailed description, please refer to the linked website, where the authors have generously shared their code and data. I appreciate their efforts and hope they can keep maintaining this project, which will surely be a great convenience to other researchers. I have also shared my code for a simple visualization of this data, along with some brief descriptions. In addition, 
&lt;a href=&#34;https://www.lhpedersen.com/research&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Jensen, Kelly and Pedersen (2021)&lt;/a&gt; also provide a comprehensive analysis of the characteristic universe around the world. &lt;br&gt;

&lt;a href=&#34;visualize_anom_publication_effect.R&#34;&gt;[Code]&lt;/a&gt; 
&lt;a href=&#34;anom_publication_demo.html&#34;&gt;[Demo]&lt;/a&gt; 
&lt;a href=&#34;VBC_Causal_Notes.pdf&#34;&gt;[Note]&lt;/a&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Variational Bayes Dynamic Variable Selection based on Rcpp and Armadillo</title>
      <link>/post/vbdvsrcpp/</link>
      <pubDate>Sun, 04 Oct 2020 00:00:00 +0000</pubDate>
      <guid>/post/vbdvsrcpp/</guid>
      <description>&lt;h2 id=&#34;brief-discussion&#34;&gt;Brief Discussion&lt;/h2&gt;
&lt;p&gt;In this section, I share part of my replication of the Monte Carlo experiment in Koop, Gary, and Dimitris Korobilis. 2020. 
&lt;a href=&#34;https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3246472&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;“Bayesian Dynamic Variable Selection in High Dimensions”&lt;/a&gt;. My implementation is based on Rcpp and Armadillo, and this optimized hybrid code can significantly improve computational efficiency. I post both the package in development and a quick demo here for reference.

&lt;a href=&#34;/CODES/vbdvsarmadillo_1.0.tar.gz&#34;&gt;[Code]&lt;/a&gt;

&lt;a href=&#34;vbdvsarmadillo_demo.html&#34;&gt;[Demo]&lt;/a&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>We are chasing perfectness, but we have to accept flaws</title>
      <link>/post/motto/</link>
      <pubDate>Sun, 27 Sep 2020 00:00:00 +0000</pubDate>
      <guid>/post/motto/</guid>
      <description></description>
    </item>
    
    <item>
      <title>Which Anomalies Matter for Portfolio Construction</title>
      <link>/post/hf_anomalies_portfolio/</link>
      <pubDate>Fri, 19 Jun 2020 00:00:00 +0000</pubDate>
      <guid>/post/hf_anomalies_portfolio/</guid>
      <description>&lt;h2 id=&#34;main-algorithm-used-for-checking-variable-importance&#34;&gt;Main Algorithm used for Checking Variable Importance&lt;/h2&gt;
&lt;p&gt;Our analysis is based on Random Forest, and we use the following &lt;strong&gt;permutation check&lt;/strong&gt; algorithm to check variable importance:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;VI_check_iml.png&#34; alt=&#34;Permutation check algorithm for variable importance&#34;&gt;&lt;/p&gt;
&lt;p&gt;Specifically, the adopted machine learning method is denoted by $f$, which is Random Forest in our application. $\boldsymbol{X}$ refers to the normalized anomaly variables and $\boldsymbol{y}$ corresponds to the ownership of the hedge fund portfolio.&lt;/p&gt;
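&lt;p&gt;The permutation check can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the code behind the results here: the model &lt;code&gt;f&lt;/code&gt; is a hypothetical stand-in function rather than a fitted Random Forest, the data are made up, and importance is measured as the increase in mean squared error after shuffling one column (a ratio of errors is another common choice).&lt;/p&gt;

```python
import random


def mean_squared_error(model, X, y):
    """Average squared prediction error of model over the rows of X."""
    return sum((model(row) - target) ** 2 for row, target in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean increase in error when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mean_squared_error(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            X_perm = [row[:j] + [value] + row[j + 1:] for row, value in zip(X, column)]
            increases.append(mean_squared_error(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def f(row):
    # Hypothetical stand-in for the fitted model: uses feature 0, ignores feature 1
    return 2.0 * row[0]


# Made-up data: y depends only on the first feature
X = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0]]
y = [2.0 * row[0] for row in X]

importances = permutation_importance(f, X, y)
print(importances)
```

&lt;p&gt;Shuffling the second column leaves the error unchanged because the stand-in model never reads it, so its importance is zero, while shuffling the first column raises the error.&lt;/p&gt;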
&lt;h2 id=&#34;local-effects-of-anomalies&#34;&gt;Local Effects of Anomalies&lt;/h2&gt;
&lt;p&gt;
&lt;a href=&#34;#figure-local-effects-of-anomalies&#34;&gt;The following graph&lt;/a&gt; shows the local effects of the most influential anomalies on hedge fund ownership.&lt;/p&gt;





  
  











&lt;figure id=&#34;figure-local-effects-of-anomalies&#34;&gt;


  &lt;a data-fancybox=&#34;&#34; href=&#34;/post/hf_anomalies_portfolio/anomalies_local_effect_hu54d91a2bab3bb3210b563591de8feb7d_86995_2000x2000_fit_q90_lanczos.jpg&#34; data-caption=&#34;Local Effects of Anomalies&#34;&gt;


  &lt;img data-src=&#34;/post/hf_anomalies_portfolio/anomalies_local_effect_hu54d91a2bab3bb3210b563591de8feb7d_86995_2000x2000_fit_q90_lanczos.jpg&#34; class=&#34;lazyload&#34; alt=&#34;&#34; width=&#34;984&#34; height=&#34;955&#34;&gt;
&lt;/a&gt;


  
  
  &lt;figcaption data-pre=&#34;Figure &#34; data-post=&#34;:&#34; class=&#34;numbered&#34;&gt;
    Local Effects of Anomalies
  &lt;/figcaption&gt;


&lt;/figure&gt;

</description>
    </item>
    
  </channel>
</rss>
