We present a general Markovian framework for order book modeling. Our aim is to provide a tool for better understanding the price formation process and the link between microscopic and macroscopic features of financial assets. To do so, we propose a new method of order book representation and decompose the problem of order book modeling into two sub-problems: the dynamics of a continuous-time double auction system with a fixed reference price, and the interactions between the double auction system and the reference price movements. State dependency is included in our framework by allowing the order flow intensities to depend on the order book state. Furthermore, contrary to most existing models, the impact of the order book updates on the reference price dynamics is not assumed to be instantaneous. We first prove that, under general assumptions, our system is ergodic. We then deduce the convergence of the rescaled price process towards a Brownian motion.
Reinforcement Learning (RL) has been applied to robotic arm control, enabling an agent to learn an effective policy for solving complex tasks. However, it requires constant interaction with the environment, which leads to low sample efficiency. In this paper, we propose a robotic arm control approach based on planning via lookahead search, a model-based RL algorithm that improves sample efficiency. The approach builds an environment model in order to capture the dynamics of the environment. The model can then be used to plan future actions through a tree-based search. The experiments show that our approach can solve the task of robotic arm control with fewer environment samples.
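The idea of planning with a model through a tree-based search can be illustrated with a minimal sketch. It is not the paper's algorithm: the search below is a plain exhaustive depth-limited lookahead, and `toy_model`, the one-dimensional state, the goal position, and the discount `gamma` are all hypothetical stand-ins for a learned environment model of a robotic arm.

```python
def lookahead(model, state, actions, depth, gamma=0.95):
    """Exhaustive depth-limited search using a model instead of the real environment.

    Returns (best_value, best_action) for the given state.
    """
    if depth == 0:
        return 0.0, None
    best_value, best_action = float("-inf"), None
    for action in actions:
        # The model predicts the next state and reward; no real interaction is used.
        next_state, reward = model(state, action)
        future_value, _ = lookahead(model, next_state, actions, depth - 1, gamma)
        value = reward + gamma * future_value
        if value > best_value:
            best_value, best_action = value, action
    return best_value, best_action


def toy_model(state, action, goal=3):
    """Hypothetical 1-D dynamics: move by `action`, pay the distance to the goal."""
    next_state = state + action
    return next_state, -abs(next_state - goal)


def run_episode(start, steps, depth=3):
    """Re-plan at every step and apply only the first planned action."""
    state = start
    for _ in range(steps):
        _, action = lookahead(toy_model, state, (-1, 0, 1), depth)
        state, _ = toy_model(state, action)
    return state


if __name__ == "__main__":
    print(run_episode(start=0, steps=5))
```

Every call to `model` replaces an interaction with the real environment, which is where the sample-efficiency gain of model-based planning comes from; the cost is that the plan is only as good as the model.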
We present a general Markovian framework for order book modeling. Our aim is to provide a tool for better understanding the price formation process and the link between microscopic and macroscopic features of financial assets. To do so, we propose a new method of order book representation and decompose the problem of order book modeling into two subproblems: the dynamics of a continuous-time double auction system with a fixed reference price, and the interactions between the double auction system and the reference price movements. State dependency is included in our framework by allowing order flow intensities to depend on the order book state. Furthermore, contrary to most existing models, the impact of the order book updates on the reference price dynamics is not assumed to be instantaneous. We first prove, under general assumptions, the ergodicity of our system. We then deduce the convergence of the rescaled price process towards a Brownian motion.
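State-dependent order flow intensities can be illustrated on a single queue. The sketch below is an assumption-laden toy, not the framework described above: it simulates one limit of the book as a continuous-time birth-death chain whose insertion and removal rates are functions of the current queue size, with a fixed reference price. The function names and the example rate functions are hypothetical.

```python
import random


def simulate_queue(arrival_rate, removal_rate, q0, horizon, seed=0):
    """Simulate a queue whose event intensities depend on its current size.

    arrival_rate(q): intensity of limit order insertions at size q.
    removal_rate(q): intensity of cancellations plus market orders at size q.
    Returns the path as a list of (time, size) pairs.
    """
    rng = random.Random(seed)
    t, q = 0.0, q0
    path = [(t, q)]
    while True:
        lam = arrival_rate(q)
        mu = removal_rate(q) if q > 0 else 0.0  # an empty queue cannot shrink
        total = lam + mu
        if total <= 0.0:
            break
        t += rng.expovariate(total)  # exponential waiting time to the next event
        if t > horizon:
            break
        q += 1 if rng.random() < lam / total else -1
        path.append((t, q))
    return path


def time_average(path, horizon):
    """Time-weighted average queue size over [0, horizon]."""
    area = 0.0
    for (t0, q0), (t1, _) in zip(path, path[1:]):
        area += q0 * (t1 - t0)
    last_t, last_q = path[-1]
    area += last_q * (horizon - last_t)
    return area / horizon


if __name__ == "__main__":
    # Hypothetical rates: constant insertions, removals proportional to queue size.
    path = simulate_queue(lambda q: 1.0, lambda q: 0.2 * q, q0=0, horizon=2000.0)
    print(round(time_average(path, 2000.0), 2))
```

With these example rates the chain has a Poisson invariant distribution with mean 1.0 / 0.2 = 5, so the long-run time average should settle near 5, a simple instance of the ergodic behaviour the abstract refers to.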
This thesis consists of two connected parts, the first about limit order book modeling and the second about tick value effects. In the first part, we present our framework for Markovian order book modeling. The queue-reactive model is first introduced, in which we revise the traditional zero-intelligence approach by adding state dependency in the order arrival processes. An empirical study shows that this model is very realistic and reproduces many interesting microscopic features of the underlying asset, such as the distribution of the order book. We also demonstrate that it can be used as an efficient market simulator, allowing for the assessment of complex placement tactics. We then extend the queue-reactive model to a general Markovian framework for order book modeling. Ergodicity conditions are discussed in detail in this setting. Under some rather weak assumptions, we prove the convergence of the order book state towards an invariant distribution and that of the rescaled price process to a standard Brownian motion. In the second part of this thesis, we study the role played by the tick value at both microscopic and macroscopic scales. First, an empirical study of the consequences of a tick value change is conducted using data from the 2014 Japanese tick size reduction pilot program. A prediction formula for the effects of a tick value change on the trading costs is derived and successfully tested. Then, an agent-based model is introduced in order to explain the relationships between market volume, price dynamics, bid-ask spread, tick value and the equilibrium order book state.
We build an agent-based model for the order book with three types of market participants: an informed trader, a noise trader and competitive market makers. Using a Glosten-Milgrom-like approach, we are able to deduce the whole limit order book (bid-ask spread and volume available at each price) from the interactions between the different agents. More precisely, we obtain a link between efficient price dynamics, proportion of trades due to the noise trader, traded volume, bid-ask spread and equilibrium limit order book state. With this model, we provide a relevant tool for regulators and market platforms. We show, for example, that it allows us to forecast the consequences of a tick size change on the microstructure of an asset. It also enables us to value quantitatively the queue position of a limit order in the book.
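For readers unfamiliar with the Glosten-Milgrom setting mentioned above, the following sketch computes zero-profit quotes in the classical textbook version, not in the model of this paper. The assumptions are: the asset value is either `v_low` or `v_high`, a fraction `alpha` of traders is informed and trades in the direction of the true value, the remaining noise traders buy or sell with probability 1/2, and competitive market makers set each quote equal to the expected value conditional on the trade direction. All names are hypothetical.

```python
def glosten_milgrom_quotes(v_low, v_high, p_high, alpha):
    """Zero-profit bid and ask in the classical two-value Glosten-Milgrom setting.

    p_high: prior probability that the value is v_high.
    alpha:  fraction of informed traders (they buy if high, sell if low).
    """
    noise = (1.0 - alpha) / 2.0  # probability of a noise buy (or a noise sell)
    # Likelihood of observing a buy order under each value.
    buy_given_high = alpha + noise
    buy_given_low = noise
    # Likelihood of observing a sell order under each value.
    sell_given_high = noise
    sell_given_low = alpha + noise

    p_low = 1.0 - p_high
    # Ask = E[value | buy], Bid = E[value | sell], by Bayes' rule.
    ask = (p_high * buy_given_high * v_high + p_low * buy_given_low * v_low) / (
        p_high * buy_given_high + p_low * buy_given_low
    )
    bid = (p_high * sell_given_high * v_high + p_low * sell_given_low * v_low) / (
        p_high * sell_given_high + p_low * sell_given_low
    )
    return bid, ask


if __name__ == "__main__":
    bid, ask = glosten_milgrom_quotes(v_low=99.0, v_high=101.0, p_high=0.5, alpha=0.2)
    print(bid, ask, ask - bid)
```

With a symmetric prior the spread reduces to `alpha * (v_high - v_low)`, which makes explicit how the bid-ask spread grows with the share of informed trading.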
The high cost of environmental interaction and low data efficiency limit the development of reinforcement learning in robotic grasping. This paper proposes an end-to-end robotic grasping method based on offline reinforcement learning via sequence modeling. It considers the most recent n-step history to assist the agent in making decisions, where a predictive model learns to directly predict actions from raw image inputs. The experimental results show that our method can achieve a higher grasping success rate with less training data than traditional reinforcement learning algorithms in the offline setting.
Configuration risk management refers to the use of probabilistic safety assessment technology to calculate risk indicators and perform risk management based on the actual operating configuration of a nuclear power plant. Since its development in 1990, this method has been widely used in nuclear power plants. It can help power plant personnel optimize maintenance plan scheduling, control the nuclear risk level of units and improve the safety of power plants. During the shutdown and outage of a nuclear power plant, a large amount of work has to be performed, resulting in the concentrated shutdown of system equipment, and nuclear safety during this period has attracted the attention of the power plant. Therefore, in order to ensure that the nuclear risk of the unit during shutdown and outage can be controlled, Sanmen Nuclear Power, as the world’s first AP1000 unit, actively performs configuration risk management and control during the shutdown. It is the first time in China that outage plan scheduling has been optimized through a combination of qualitative and quantitative configuration risk assessment methods, so as to comprehensively investigate and eliminate potential risk sources. This paper introduces the development background of configuration risk management and takes the Sanmen Nuclear Power AP1000 unit as the object to present the application and achievements of qualitative and quantitative configuration risk assessment in outage scheduling optimization. In addition, it provides a comparative analysis of qualitative and quantitative configuration risk assessment techniques.
We propose a learning-based network for depth map estimation from multi-view stereo (MVS) images. Our proposed network consists of three sub-networks: 1) a base network for initial depth map estimation from an unstructured stereo image pair, 2) a novel refinement network that leverages both photometric and geometric information, and 3) an attentional multi-view aggregation framework that enables efficient information exchange and integration among different stereo image pairs. The proposed network, called A-TVSNet, is evaluated on various MVS datasets and shows the ability to produce high-quality depth maps that outperform those of competing approaches. Our code is available at https://github.com/daiszh/A-TVSNet.