Virtual reality applications have attracted increasing attention in recent years, especially immersive or 360-degree videos, which typically consume much more bandwidth than traditional videos. Although all of the produced data is transferred, users watch only a small part of it, called the field of view (FoV) or viewport, because of the nature of immersive video. This wastes a large share of network resources. It is therefore important to define a viewport-dependent streaming strategy that detects where the user is gazing and how the user's head moves. Unfortunately, few datasets provide this information. In this paper, we propose a tile-based simulation approach that generates the distribution of user behavior and provides information that can be used to optimize future view-dependent streaming protocols. We first characterize users' viewport patterns from datasets gathered from real users by decomposing the 360-degree stream into tiles and analyzing the frequency and time-interval distribution of each tile. We then devise a hierarchical Markov model that incorporates the beta distribution of each tile's time interval to predict tile transitions. The results show that the simulation tool accurately characterizes users' tile sequences, performing close to the empirical results.
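As an illustration of the idea summarized above, the following minimal sketch simulates a tile sequence with a plain first-order Markov chain whose per-tile dwell times are drawn from a beta distribution. It is not the hierarchical model proposed in the paper: the function names `fit_transitions` and `simulate`, the tile IDs, the beta parameters, and the scaling constant `max_dwell` are all hypothetical, and the input sequences are assumed to list visited tiles with consecutive repeats already collapsed.

```python
import random
from collections import Counter, defaultdict


def fit_transitions(sequences):
    """Estimate first-order tile transition probabilities from tile-ID sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Count each observed move from one tile to the next.
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    # Normalize the counts of each row into probabilities.
    return {
        tile: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for tile, row in counts.items()
    }


def simulate(transitions, dwell_params, start, steps, max_dwell, rng):
    """Generate up to `steps` (tile, dwell_time) pairs.

    dwell_params maps each tile to assumed (alpha, beta) shape parameters;
    the beta sample in [0, 1] is scaled by max_dwell (an assumed upper bound).
    """
    tile = start
    trace = []
    for _ in range(steps):
        alpha, beta = dwell_params[tile]
        trace.append((tile, max_dwell * rng.betavariate(alpha, beta)))
        row = transitions.get(tile)
        if not row:
            break  # no outgoing transition was observed for this tile
        tiles, probs = zip(*row.items())
        tile = rng.choices(tiles, weights=probs, k=1)[0]
    return trace


# Hypothetical example: three tiles, one observed viewing sequence.
transitions = fit_transitions([[0, 1, 0, 2]])
dwell_params = {0: (2.0, 5.0), 1: (2.0, 5.0), 2: (2.0, 5.0)}
trace = simulate(transitions, dwell_params, start=0, steps=5,
                 max_dwell=10.0, rng=random.Random(0))
```

Under these assumptions, each entry of `trace` pairs a tile with a simulated viewing duration, which is the kind of sequence one could compare against empirical frequency and time-interval statistics.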