Population Synthesis Reference List


Müller, Kirill and Kay W. Axhausen. Population Synthesis for Microsimulation: State of the Art, (2011) 21p.

Abstract - In agent-based microsimulation models for land use or transportation planning, agents' decisions are simulated over time in order to predict future states of the system. The initial step is the definition of agents -- e.g., persons and households. If a snapshot of the entire population of the study area, taken at the simulation's base year, were on hand, one could use this as an initial condition. Unfortunately, such data is often not available due to privacy and cost constraints. To tackle this issue, one can combine different data sources to derive a disaggregate representation of the agents, matching given criteria like correlation structure and control totals. This process is referred to as population synthesis. The authors summarize recent efforts to generate synthetic populations for microsimulation. All of the aforementioned studies share two tasks: (a) adjustment of an initial population, taken from a past census or other survey data, to current constraints, and (b) selecting households into the generated population. The authors describe the above tasks, and analyze and evaluate the characteristics of the particular approaches. This digest will hopefully be helpful for the implementation of future population synthesis routines.

Transportation Research Board 90th Annual Meeting, Washington, DC, USA, January 23-27, 2011. Sponsor: Transportation Research Board.
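The two tasks named in the abstract above, fitting and selection, can be illustrated with a minimal Python sketch. It is not taken from the paper: it assumes a two-way seed table, uses classic iterative proportional fitting (IPF) for the fitting step, and uses weighted random draws for the selection step. The function names `ipf` and `select_households`, the attribute categories, and all numbers are hypothetical.

```python
import random


def ipf(seed, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Scale a 2-D seed table until its margins match the targets.

    Assumes the row and column targets sum to the same total.
    """
    table = [list(row) for row in seed]
    for _ in range(max_iter):
        # Fit the row margins.
        for i, target in enumerate(row_targets):
            row_sum = sum(table[i])
            if row_sum > 0:
                table[i] = [v * target / row_sum for v in table[i]]
        # Fit the column margins.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in table)
            if col_sum > 0:
                for row in table:
                    row[j] *= target / col_sum
        # Columns now match; stop once the rows also match.
        gap = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if gap < tol:
            break
    return table


def select_households(households, weights, n, seed=0):
    """Draw n households with probability proportional to their weights."""
    rng = random.Random(seed)
    return rng.choices(households, weights=weights, k=n)


# Hypothetical seed: rows = household size (1, 2+), columns = cars (0, 1+).
seed_table = [[1.0, 2.0], [3.0, 4.0]]
fitted = ipf(seed_table, row_targets=[40.0, 60.0], col_targets=[30.0, 70.0])

# Use the fitted cell values as selection weights for the four cell types.
cells = [("size1", "cars0"), ("size1", "cars1+"),
         ("size2+", "cars0"), ("size2+", "cars1+")]
cell_weights = [v for row in fitted for v in row]
population = select_households(cells, cell_weights, n=100)
```

Cells that are zero in the seed stay zero after fitting, because IPF only rescales existing values.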

Müller, K., Axhausen, K.W., 2011. Hierarchical IPF: generating a synthetic population for Switzerland. Presented at the 4th European Regional Science Association conference, available from http://www.sustaincity.org/publications/SC_Hierarchical_IPF.pdf


Abraham, John E., Kevin J. Stefan and J. D. Hunt. Population Synthesis Using Combinatorial Optimization at Multiple Levels, (2012) 17p.

Abstract - With the increasing use of disaggregate models and microsimulation techniques, an important component for practitioners in the modelling field is the creation of a synthetic population, which is a disaggregate representation of the population of an area similar to the real population (current or future) and matching certain known or forecast distributions of attributes such as household size and income. This paper describes an approach using a combinatorial optimization algorithm; a versatile technique capable of simultaneously matching targets at multiple agent levels, such as properties of households as well as for individuals within the households. The software also supports simultaneous targets defined for multiple geographical levels (such as zones, counties and states). The use of the software is demonstrated in two applications; the synthesis of the 2000 population of California (comprising some 33.9 million individuals in 11.5 million households), and the synthesis of the ca. 2008 employment in Oregon and surrounding areas (comprising 3.5 million workers). The algorithm is acceptably fast and matches the targets with a high degree of accuracy.

Transportation Research Board 91st Annual Meeting, Washington, DC, USA, January 22-26, 2012. Sponsor: Transportation Research Board.
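The idea of matching household-level and person-level targets at once by combinatorial optimization can be sketched as a simple hill-climbing search. This is an illustrative assumption, not the authors' software: the swap rule, the absolute-deviation error measure, the names `total_error` and `synthesize`, and the sample records are all hypothetical.

```python
import random


def total_error(selection, sample, targets):
    """Sum of absolute deviations between selected totals and targets."""
    totals = {key: 0 for key in targets}
    for idx in selection:
        for key in targets:
            totals[key] += sample[idx][key]
    return sum(abs(totals[key] - targets[key]) for key in targets)


def synthesize(sample, targets, n_households, n_iter=5000, seed=0):
    """Swap sampled households in and out, keeping swaps that do not hurt."""
    rng = random.Random(seed)
    selection = [rng.randrange(len(sample)) for _ in range(n_households)]
    error = total_error(selection, sample, targets)
    initial_error = error
    for _ in range(n_iter):
        if error == 0:
            break
        pos = rng.randrange(n_households)
        old = selection[pos]
        selection[pos] = rng.randrange(len(sample))
        new_error = total_error(selection, sample, targets)
        if new_error <= error:
            error = new_error      # accept the swap
        else:
            selection[pos] = old   # reject the swap
    return selection, initial_error, error


# Hypothetical sample: one household-level count and two person-level counts.
sample = [
    {"households": 1, "persons": 1, "workers": 1},
    {"households": 1, "persons": 2, "workers": 1},
    {"households": 1, "persons": 3, "workers": 2},
    {"households": 1, "persons": 4, "workers": 2},
    {"households": 1, "persons": 2, "workers": 0},
]
targets = {"households": 20, "persons": 50, "workers": 25}
selection, initial_error, final_error = synthesize(sample, targets, n_households=20)
```

Targets for several geographic levels could be added as further keys in the same error function, at the cost of a slower search.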

Goulias, Konstadinos G., Chandra R. Bhat, Ram M. Pendyala, Yali Chen, Rajesh Paleti, Karthik C. Konduri, Ting Lei, Seo Youn Yoon, Guoxiong Huang and Hsi-Hwa Hu. Simulator of Activities, Greenhouse Emissions, Networks, and Travel (Simagent) in Southern California, (2012) 25p.

Abstract - In this paper the authors describe the recently developed large scale spatio-temporal simulator of activities and travel for Southern California. The simulator includes population synthesis that recreates the entire resident population in this Mega region, provides locations for residences, workplaces, and schools for each person, estimates car ownership and type, and provides other key personal and household characteristics. Then, a synthetic schedule generator recreates for each resident person in the simulated region a schedule of activities and travel that reflects intra-household activity coordination for a day. These synthetic activity and travel daily schedules are then converted to multiple Origin Destination (OD) matrices at different times in a day. These are in turn combined with other OD matrices (representing truck travel, travel from and to ports and airports, and travel generated outside the region) and assigned to the network. The assignment output is then used in the software EMFAC to produce estimates of fuel consumed and pollutants emitted (including CO2) by different classes of vehicles. The overall model system also includes provision for finer spatial and temporal resolutions and a staged plan to implement them in TRANSIMS and MATSIM. Numerical examples from each major modeling group are also provided together with an outline of next steps.

Transportation Research Board 91st Annual Meeting, Washington, DC, USA, January 22-26, 2012. Sponsor: Transportation Research Board.

Harland K., Heppenstall A., Smith, D. & Birkin, M. (2012), Creating realistic synthetic populations at varying spatial scales: a comparative critique of population synthesis techniques, Journal of Artificial Societies and Social Simulation, 15(1) 1.

Kao, Shih-Chieh, Hoe Kyoung Kim, Cheng Liu, Xiaohui Cui and Budhendra L. Bhaduri. Dependence-Preserving Approach to Synthesizing Household Characteristics. Transportation Research Record: Journal of the Transportation Research Board, no. 2302 (2012): pp 192–200.

Abstract - One effective approach to study day-to-day traveler behavior is through the activity-based traffic demand model, in which all travelers are treated as individual agents and interact under a computation-intensive framework. Nevertheless, because of high survey costs, low response rate, and privacy concerns, detailed household and personal characteristics are usually unavailable. Various population synthesizers were therefore proposed to reconstruct a methodologically rigorous estimate of household characteristics from different surveys. For instance, the iterative proportional fitting (IPF) algorithm is used to synthesize the full population from the public use microdata sample (PUMS) and Census Summary File 3 (SF3) in the popular activity-based traffic demand model, TRANSIMS. However, some fundamental limitations of IPF (e.g., zero cells in the contingency table as a result of small sample size) have drawn sufficient attention and resulted in the development of enhanced IPF algorithms and other strategies. This paper proposes a copula-based method to synthesize household characteristics that preserves marginal distributions and dependence structure between variables. The proposed method is tested for the state of Iowa, and the results are compared with the IPF approach of TRANSIMS. The synthesized households resulted in the same local SF3 statistics at each block group. But having similar intervariable correlations as described in the PUMS suggests the applicability of the copula-based approach. Because marginal distributions and dependence structure can be faithfully preserved, the proposed method could be a suitable alternative to synthesize realistic agent characteristics for further activity-based traffic demand modeling. http://dx.doi.org/10.3141/2302-21
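A copula separates the marginal distributions from the dependence structure, which is why both can be preserved together. The sketch below illustrates this with a Gaussian copula and two categorical household attributes. It is an assumption for illustration only: the abstract does not state which copula family the authors use, and the correlation value, category labels, marginal shares, and function names are hypothetical.

```python
import math
import random


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def category_from_uniform(u, categories, probs):
    """Inverse-CDF lookup: map u in [0, 1) to a category."""
    cumulative = 0.0
    for category, p in zip(categories, probs):
        cumulative += p
        if u < cumulative:
            return category
    return categories[-1]  # guard against floating-point round-off


def synthesize_households(n, rho, size_marginal, income_marginal, seed=0):
    """Draw (size, income) pairs with given marginals and correlation rho."""
    rng = random.Random(seed)
    households = []
    for _ in range(n):
        # Correlated standard normals carry the dependence structure.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Uniform scores are then mapped through each marginal separately.
        size = category_from_uniform(normal_cdf(z1), *size_marginal)
        income = category_from_uniform(normal_cdf(z2), *income_marginal)
        households.append((size, income))
    return households


# Hypothetical marginals, standing in for zone-level control tables.
size_marginal = (["1", "2", "3+"], [0.3, 0.4, 0.3])
income_marginal = (["low", "mid", "high"], [0.5, 0.3, 0.2])
households = synthesize_households(20000, 0.6, size_marginal, income_marginal)
```

In practice the dependence parameter would be estimated from a microdata sample such as PUMS, and the marginals taken from local summary tables.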

Ma, Lu and Sivaramakrishnan Srinivasan. Synthesizing Target-Year Populations for Input to Travel Demand Models, (2012) 24p.

Abstract - This study contributes by presenting an empirical assessment of target year populations synthesized with different base-year populations, data-fusion methods, and control tables. Twelve synthetic populations were synthesized for 12 census tracts in Florida. The empirical results indicate the value of synthesizing more accurate base-year populations by accommodating multi-level controls. The impact of the data fusion methodology applied in the target year context is more modest possibly because there are fewer control tables available in the target year. Finally, errors in the target year control tables significantly reduce the accuracy of the synthesized populations. The magnitude of the overall error in the synthesized population appears to be linearly related to the magnitude of the input errors introduced via the control tables. Overall, efforts to accurately synthesize base-year populations and obtain target-year controls can help synthesize good target-year populations.

Transportation Research Board 91st Annual Meeting, Washington, DC, USA, January 22-26, 2012. Sponsor: Transportation Research Board.

McCray, Danielle R., John S. Miller and Lester A. Hoel. Accuracy of Zonal Socioeconomic Forecasts for Travel Demand Modeling: Retrospective Case Study. Transportation Research Record: Journal of the Transportation Research Board, no. 2302 (2012): pp 148–156.

Abstract - Modeling for urban travel demand begins with 20-year forecasts of population, households, vehicle ownership, and employment for a region’s individual transportation analysis zones. Yet even though these models rely on socioeconomic forecasts, the long-term accuracy of such models has not received attention, especially for smaller regions with limited planning staff. This paper reports on a case study of the socioeconomic predictions made in 1980 for a horizon year of 2000, by comparing predicted and actual results in Lynchburg, Virginia. The region percentage error reflects the difference between forecast and observed values for the entire region. Although regional forecasts for the number of vehicles and employment showed errors of less than 10%, those forecasts for population and households showed errors of 48% and 14%, respectively. The failure of planned development in two of the region’s 68 zones accounted for much of this error, such that removal of these two zones lowered population and household region percentage errors to 10% and 1%, respectively. The zone percentage error is the average of all individual zone percentage errors. Even after removal of the two aforementioned zones, population, households, vehicles, and employment had strikingly large zone percent errors of 39%, 48%, 45%, and 136%, respectively. These results make a compelling case for executing the regional travel demand model twice: once with the given socioeconomic forecasts and once with forecasts modified on the basis of expected errors. For regions that have not conducted an assessment such as that presented here, the expected errors from this paper may be used. http://dx.doi.org/10.3141/2302-16
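The difference between the two error measures defined in the abstract above can be shown with a small worked example. The sketch assumes absolute percentage errors taken relative to the observed value, which the abstract does not spell out; the function names and the three-zone data are hypothetical.

```python
def region_percentage_error(forecast, observed):
    """Percentage error of the regional total (all zones summed)."""
    return 100.0 * abs(sum(forecast) - sum(observed)) / sum(observed)


def zone_percentage_error(forecast, observed):
    """Average of the individual zone percentage errors."""
    errors = [100.0 * abs(f - o) / o for f, o in zip(forecast, observed)]
    return sum(errors) / len(errors)


# Hypothetical three-zone region: zone errors cancel in the regional total.
forecast = [100, 200, 300]
observed = [200, 100, 300]

region_error = region_percentage_error(forecast, observed)  # 0.0
zone_error = zone_percentage_error(forecast, observed)      # 50.0
```

Offsetting zone errors cancel in the regional total, so a small region percentage error can coexist with a large zone percentage error, as in the Lynchburg results.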

Otani, Noriko, Nao Sugiki, Varameth Vichiensan and Kazuaki Miyamoto. Modifiable Attribute Cell Problem and Solution Method for Population Synthesis in Land Use Microsimulation. Transportation Research Record: Journal of the Transportation Research Board, no. 2302 (2012): pp 157–163.

Abstract - Land use microsimulation requires the preparation of a set of microdata for the base year. Most existing procedures used for the synthesis of population data are based on the iterative proportional fitting method, in which the number of individuals in each cell of the cross-classification table is estimated. Such a procedure is referred to as the cell-based approach in this study. The approach is based on predefined categories of individuals. Originally, however, these individuals have continuous attributes. Therefore, a different type of categorization would yield a different classification table, which would change the end results of the analysis. In this paper, this phenomenon is referred to as the modifiable attribute cell problem (MACP). It is similar to the modifiable area unit problem that arises when spatial data are aggregated into zones. This paper addresses MACP and proposes a method to determine the best combination of the categories. The solution of MACP is considered to be the minimization of the number of cells in a table with respect to the key output variable that has been defined and used as an evaluation criterion. Because of the computational difficulty resulting from the combination explosion, symbiotic evolution, which is a kind of genetic algorithm, is used. Finally, a case study is presented for the Sapporo metropolitan area of Japan. http://dx.doi.org/10.3141/2302-17

Pendyala, Ram M., Chandra R. Bhat, Konstadinos G. Goulias, Rajesh Paleti, Karthik C. Konduri, Raghu Sidharthan, Hsi-Hwa Hu, Guoxiong Huang and Keith P. Christian. Application of Socioeconomic Model System for Activity-Based Modeling: Experience from Southern California. Transportation Research Record: Journal of the Transportation Research Board, no. 2303 (2012): pp 71–80.

Abstract - This paper presents results from the application of a comprehensive socioeconomic and demographic model system in conjunction with a continuous-time, activity-based microsimulation model of travel demand developed for the Southern California Association of Governments. The socioeconomic model system includes two major components. The first is a synthetic population generator that is capable of synthesizing a representative population for the entire region while controlling for both household- and person-level marginal distributions. The second is an econometric microsimulator that models various socioeconomic and demographic attributes for each person in the synthetic population with a view to developing a rich set of input data for the activity-based microsimulation model system. The results show that the socioeconomic model system is capable of replicating known distributions of demographic attributes in the population and can be easily scaled for implementation in large regions such as the Southern California area, which includes a population of more than 18 million people in its model boundaries. http://dx.doi.org/10.3141/2303-08

Pritchard, David R. and Eric J. Miller. Advances in Population Synthesis: Fitting Many Attributes Per Agent and Fitting to Household and Person Margins Simultaneously. Transportation: Planning, Policy, Research, Practice 39, no. 3 (2012): pp 685-704.

Abstract - Agent-based microsimulation models simulate the behavior of individual agents over time in order to forecast the future state of an aggregate system. They can be used to model transportation, land use or other socioeconomic processes. These models require an initial synthetic population derived from census data, which usually are created using the iterative proportional fitting (IPF) procedure. In this paper, a computational method is proposed that allows the synthesis of many more attributes and finer attribute categories than the IPF procedure. A novel conditional Monte Carlo synthesis procedure allows a simultaneous fit to household, family and person-level data. This permits a valid synthesis of relationships between agents for census data. The results of each new method are evaluated empirically in terms of goodness-of-fit. Although the proposed technique was developed to address limitations specific to Canadian census data, it could also be useful for data from other countries. http://dx.doi.org/10.1007/s11116-011-9367-4

Rich, Jeppe and Ismir Mulalic. Generating Synthetic Baseline Populations from Register Data. Transportation Research Part A: Policy and Practice 46, no. 3 (2012): pp 467-479.

Abstract - The paper presents a population synthesiser based on the method of Iterative Proportional Fitting (IPF) algorithm developed for the new Danish national transport model system. The synthesiser is designed for large population matrices and allows target variables to be represented in several target constraints. As a result, constraints for the IPF are cross-linked, which makes it difficult to ensure consistency of targets in a forecast perspective. The paper proposes a new solution strategy to ensure internal consistency of the population targets in order to guarantee proper convergence of the IPF algorithm. The solution strategy consists in establishing a harmonisation process for the population targets, which combined with a linear programming approach, is applied to generate a consistent target representation. The model approach is implemented and tested on Danish administrative register data. A test on historical census data shows that a 2006 population could be predicted by a 1994 population with an overall percentage deviation of 5-6% given that targets were known. It is also indicated that the deviation is approximately a linear function of the length of the forecast period. http://www.sciencedirect.com/science/article/pii/S0965856411001716

Rizi, S.M.M., Łatek, M.M., Geller, A., 2012. Fusing remote sensing with sparse demographic data for synthetic population generation: an algorithm and application to rural Afghanistan. International Journal of Geographical Information Science, iFirst, 1–19, DOI:10.1080/13658816.2012.734825.

Barthelemy, Johan and Philippe L. Toint. Synthetic Population Generation without a Sample. Transportation Science 47, no. 2 (2013): pp 266-279.

Abstract - The advent of microsimulation in the transportation sector has created the need for extensive disaggregated data concerning the population whose behavior is modeled. Because of the cost of collecting this data and the existing privacy regulations, this need is often met by the creation of a synthetic population on the basis of aggregate data. Although several techniques for generating such a population are known, they suffer from a number of limitations. The first is the need for a sample of the population for which fully disaggregated data must be collected, although such samples may not exist or may not be financially feasible. The second limiting assumption is that the aggregate data used must be consistent, a situation that is most unusual because these data often come from different sources and are collected, possibly at different moments, using different protocols. The paper presents a new synthetic population generator in the class of the Synthetic Reconstruction methods, whose objective is to obviate these limitations. It proceeds in three main successive steps: generation of individuals, generation of household type's joint distributions, and generation of households by gathering individuals. The main idea in these generation steps is to use data at the most disaggregated level possible to define joint distributions, from which individuals and households are randomly drawn. The method also makes explicit use of both continuous and discrete optimization and uses the χ² metric to estimate distances between estimated and generated distributions. The new generator is applied for constructing a synthetic population of approximately 10,000,000 individuals and 4,350,000 households localized in the 589 municipalities of Belgium. The statistical quality of the generated population is discussed using criteria extracted from the literature, and it is shown that the new population generator produces excellent results.
http://dx.doi.org/10.1287/trsc.1120.0408

Chingcuanco, Franco and Eric J. Miller. Demographic Microsimulation Model for Integrated Land Use, Transportation, and Environment Model System, (2013) 16p.

Abstract - The Integrated Land Use, Transportation, Environment (ILUTE) model system is an agent-based microsimulation model that dynamically evolves the urban spatial form, economic structure, demographics and travel behavior over time for the Greater Toronto-Hamilton Area (GTHA). This paper presents the ILUTE Demographic Updating Module (I-DUM), which updates the residential population demographics throughout the simulation. Given a synthetic base population, I-DUM updates the attributes of the agents at each time step. New agents are introduced through birth and in-migration, while agents exit through death and out-migration events. Unions between agents are formed through a marriage market, while a divorce model dissolves existing ones. Transitions to new households are also triggered by a move-out model. In addition to its comprehensive scope, I-DUM is a closed demographic model where social networks are built and maintained throughout the simulation. Maintaining social connections brings some advantages for modeling travel behavior and location choice. I-DUM is being tested against a twenty-year (1986-2006) period using a 100% synthetic GTHA population (4.1 million persons, 1.1 million families, 1.4 million households). The results are compared against historical observations across multiple dimensions. In general, I-DUM exhibits a strong performance, and the authors have confidence that it can maintain the validity of inputs to the other behavioral models in ILUTE. I-DUM's implementation has also been parallelized which brings significant performance improvements. Starting with over 6.5 million agents (which grows past 10 million), the simulation takes just under 10 minutes to complete a twenty-year run.

Transportation Research Board 92nd Annual Meeting, Washington, DC, USA, January 13-17, 2013. Sponsor: Transportation Research Board.

Farooq, Bilal, Michel Bierlaire, Ricardo Hurtubia and Gunnar Flötteröd. Simulation Based Population Synthesis. Transportation Research Part B: Methodological 58, (2013): pp 243-263.

Abstract - Microsimulation of urban systems evolution requires a synthetic population as a key input. Currently, the focus is on treating synthesis as a fitting problem and thus various techniques have been developed, including Iterative Proportional Fitting (IPF) and Combinatorial Optimization based techniques. The key shortcomings of these procedures include: (a) fitting of one contingency table, while there may be other solutions matching the available data; (b) due to cloning rather than true synthesis of the population, losing the heterogeneity that may not have been captured in the microdata; (c) over-reliance on the accuracy of the data to determine the cloning weights; and (d) poor scalability with respect to the increase in number of attributes of the synthesized agents. In order to overcome these shortcomings, the authors propose a Markov Chain Monte Carlo (MCMC) simulation based approach. Partial views of the joint distribution of agents' attributes that are available from various data sources can be used to simulate draws from the original distribution. The real population from the Swiss census is used to compare the performance of simulation based synthesis with the standard IPF. The standard root mean square error statistics indicated that even the worst case simulation based synthesis (SRMSE = 0.35) outperformed the best case IPF synthesis (SRMSE = 0.64). The authors also used this methodology to generate the synthetic population for Brussels, Belgium where the data availability was highly limited. http://dx.doi.org/10.1016/j.trb.2013.09.012
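Drawing agents from partial views of a joint distribution can be illustrated with a two-attribute Gibbs sampler, one common MCMC scheme. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes both full conditionals are known, uses two binary attributes, and the function name, probabilities, and burn-in length are hypothetical.

```python
import random


def gibbs_sample(p_a_given_b, p_b_given_a, n_draws, burn_in=1000, seed=0):
    """Draw (a, b) pairs by alternately sampling each attribute
    from its conditional distribution given the other one.

    p_a_given_b[b] is P(A = 1 | B = b); p_b_given_a[a] is P(B = 1 | A = a).
    """
    rng = random.Random(seed)
    a, b = 0, 0
    draws = []
    for step in range(burn_in + n_draws):
        a = 1 if rng.random() < p_a_given_b[b] else 0
        b = 1 if rng.random() < p_b_given_a[a] else 0
        if step >= burn_in:
            draws.append((a, b))
    return draws


# Hypothetical conditionals, derived from the joint distribution
# P(0,0)=0.4, P(0,1)=0.1, P(1,0)=0.2, P(1,1)=0.3 (never used directly).
p_a_given_b = {0: 0.2 / 0.6, 1: 0.3 / 0.4}
p_b_given_a = {0: 0.1 / 0.5, 1: 0.3 / 0.5}

agents = gibbs_sample(p_a_given_b, p_b_given_a, n_draws=20000)
share_11 = sum(1 for a, b in agents if (a, b) == (1, 1)) / len(agents)
```

With enough draws, `share_11` should approach the joint probability of 0.3 even though the sampler only sees the conditionals.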

