A collection of tools to automatically pair forced-choice items and examine their measurement performance


Forced-choice (FC) tests are attracting growing interest from researchers because, when well designed, they are resistant to faking. In a well-designed FC test, the items within a block typically measure different latent traits and have similar magnitudes of, or high inter-item agreement (IIA) on, social desirability. Some scoring models may also require that factor loading differences or item locations within a block be maximized or minimized.

Either way, deciding which items should be assigned to the same block (item pairing) is a crucial step in building a well-designed FC test, and it is currently carried out manually. However, because multiple objectives often need to be met simultaneously, manual pairing becomes impractical or even infeasible, especially when the number of latent traits and/or the number of items per trait is relatively large.

The R package autoFC was developed to address these difficulties. It provides tools for automatic FC test construction and for evaluating measurement performance using simulated data. It offers users the functionality to:

  1. Include multiple criteria for pairing items into the same block, with user-specified weights and calculating functions for each criterion.

  2. Automatically optimize the target function combined from the multiple criteria and produce near-optimal item pairings that satisfy the user-defined criteria.

  3. Specify blueprints for the FC blocks (i.e., exact specifications of each block, for example the traits measured and the keying of its items) and build FC blocks that are aligned with the blueprints.

  4. Produce simulated responses to FC scales, based on the Thurstonian IRT model (Brown & Maydeu-Olivares, 2011), and estimate the Thurstonian IRT model using the simulated responses.

  5. Examine the empirical reliability and measurement precision of the resulting trait scores produced from the estimation model.

Users can create an FC test of any block size (e.g., pairs, triplets, quadruplets) and can produce simulated responses to FC scales in both MOLE (Most & Least like me) and RANK formats.
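To make the difference between the two response formats concrete, here is a minimal base-R sketch of the pairwise binary coding commonly used with Thurstonian IRT models (Brown & Maydeu-Olivares, 2011). The helper names `rank_to_pairs()` and `mole_to_pairs()` are hypothetical and are not autoFC functions; the sketch only illustrates the idea that a full ranking determines every pairwise comparison in a block, while a MOLE response can leave some comparisons unobserved.

```r
# Illustrative only: these helpers are NOT part of autoFC.

# RANK format: ranks[j] is the rank given to item j (1 = most like me).
# y = 1 means item i was preferred over item k.
rank_to_pairs <- function(ranks) {
  pairs <- t(combn(length(ranks), 2))
  data.frame(
    i = pairs[, 1],
    k = pairs[, 2],
    y = as.integer(ranks[pairs[, 1]] < ranks[pairs[, 2]])
  )
}

# MOLE format: only the "most" and "least" items are known, so any
# comparison between two unselected items stays NA.
mole_to_pairs <- function(most, least, block_size) {
  stopifnot(most != least)
  pairs <- t(combn(block_size, 2))
  i <- pairs[, 1]
  k <- pairs[, 2]
  y <- rep(NA_integer_, nrow(pairs))
  y[i == most | k == least] <- 1L
  y[k == most | i == least] <- 0L
  data.frame(i = i, k = k, y = y)
}

# A triplet where item 2 is ranked first, item 1 second, item 3 last.
triplet <- rank_to_pairs(c(2, 1, 3))

# A quadruplet where item 2 is "most like me" and item 4 is "least like me".
quad <- mole_to_pairs(most = 2, least = 4, block_size = 4)
```

In the quadruplet example, the comparison between items 1 and 3 is the only one left unobserved, which is why MOLE responses carry less information than full rankings once blocks contain four or more items.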


You can install autoFC from CRAN:
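From an R session, the usual CRAN installation call is:

```r
install.packages("autoFC")
```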


You can install the development version of autoFC from GitHub:



Below is a brief explanation of all functions provided by the initial version of autoFC.

  1. cal_block_energy() and cal_block_energy_with_iia() both calculate the total energy of a single item block, or of a full FC test with multiple blocks, given a data frame of item characteristics. The latter function also incorporates IIA metrics into the energy calculation.

  2. make_random_block() takes the number of items and the block size as input arguments and produces a test whose blocks contain randomly paired item numbers. Information about item characteristics is not required.

  3. get_iia() takes item responses and a single item block (or a full FC test with multiple blocks) and returns IIA metrics for each item block.

  4. sa_pairing_generalized() is the automatic pairing function. It takes item characteristics (and also individual responses to all items) and an initial FC test, then optimizes the energy of the test using the simulated annealing (SA) algorithm.
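The workflow these functions describe (random initial blocks, an energy score per block, and simulated annealing over item swaps) can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's implementation: all helper names, the toy item pool, the energy definition (number of distinct traits minus the variance of desirability, with higher energy treated as better), and the cooling schedule are hypothetical choices made for the example.

```r
# Illustrative only: none of these helpers are autoFC functions.
set.seed(1)

# Hypothetical item pool: 12 items, 3 traits, one desirability rating each.
items <- data.frame(
  trait = rep(c("A", "B", "C"), each = 4),
  desirability = round(runif(12, 1, 5), 2)
)

# Randomly partition item indices into blocks (one block per row).
random_blocks <- function(n_items, block_size) {
  stopifnot(n_items %% block_size == 0)
  matrix(sample(n_items), ncol = block_size, byrow = TRUE)
}

# Energy of one block: reward distinct traits, penalize desirability spread.
block_energy <- function(block, items, w_trait = 1, w_desir = 1) {
  n_distinct <- length(unique(items$trait[block]))
  spread <- var(items$desirability[block])
  w_trait * n_distinct - w_desir * spread
}

# Energy of the whole test: sum of the block energies.
total_energy <- function(blocks, items) {
  sum(apply(blocks, 1, block_energy, items = items))
}

# Simulated annealing: swap two items from different blocks, always accept
# an improvement, and accept a worse test with probability exp(delta / temp).
anneal_blocks <- function(blocks, items, temp = 1, cooling = 0.99,
                          temp_min = 1e-4) {
  current <- total_energy(blocks, items)
  best <- blocks
  best_energy <- current
  while (temp > temp_min) {
    rows <- sample(nrow(blocks), 2)
    cols <- sample(ncol(blocks), 2, replace = TRUE)
    candidate <- blocks
    candidate[rows[1], cols[1]] <- blocks[rows[2], cols[2]]
    candidate[rows[2], cols[2]] <- blocks[rows[1], cols[1]]
    delta <- total_energy(candidate, items) - current
    if (delta >= 0 || runif(1) < exp(delta / temp)) {
      blocks <- candidate
      current <- current + delta
      if (current > best_energy) {
        best <- blocks
        best_energy <- current
      }
    }
    temp <- temp * cooling
  }
  list(blocks = best, energy = best_energy)
}

initial <- random_blocks(nrow(items), block_size = 3)
result <- anneal_blocks(initial, items)
```

Accepting occasional worse swaps early on, while the temperature is high, is what lets the search escape poor local pairings; as the temperature falls, the procedure behaves more and more like a greedy search.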

In the February 2024 update, we added many more functions, including the following core ones:

  1. construct_blueprint() builds exact specifications of the FC blocks (i.e., blueprints), which typically indicate the keying and measured trait of each item in each block. Additional matching criteria can also be set, indicating how closely the items should be matched on certain indicators using a pre-specified cutoff.

  2. build_scale_with_blueprint() takes a blueprint that the user built manually or through construct_blueprint() and automatically produces paired FC blocks consistent with the specifications in the blueprint.

  3. get_simulation_matrices() produces simulated item and person parameters based on the Thurstonian IRT model, using the factor analysis results extracted from lavaan::cfa() or get_CFA_estimates().

  4. convert_to_TIRT_response(), get_TIRT_long_data(), and fit_TIRT_model() extend various functions in the thurstonianIRT package (Bürkner, 2019). They allow simulated (using convert_to_TIRT_response()) or actual responses to FC scales to be converted into long format (using get_TIRT_long_data()) and fitted using lavaan, Mplus, or Stan (using fit_TIRT_model()).

  5. RMSE_range(), plot_scores(), and empirical_reliability() serve diagnostic purposes, examining the measurement accuracy of the trait scores produced by the TIRT model.
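As a rough illustration of what such diagnostics measure, the base-R sketch below computes a root mean squared error and a simulation-based reliability for each trait. It does not call the autoFC functions, and the definitions used by RMSE_range() and empirical_reliability() may differ. Two assumptions are made here: reliability is taken as the squared correlation between true and estimated scores, and the "estimated" scores are simply true scores plus noise rather than output from a fitted TIRT model.

```r
# Illustrative only: simulated data and definitions are assumptions.
set.seed(42)
n_persons <- 500
traits <- c("A", "B", "C")

# Simulated true trait scores (standard normal, one column per trait).
true_scores <- matrix(
  rnorm(n_persons * length(traits)),
  ncol = length(traits),
  dimnames = list(NULL, traits)
)

# Stand-in for model estimates: true scores plus error with SD 0.5.
estimated_scores <- true_scores +
  matrix(rnorm(n_persons * length(traits), sd = 0.5), ncol = length(traits))

# Root mean squared error between estimated and true scores, per trait.
rmse_by_trait <- sqrt(colMeans((estimated_scores - true_scores)^2))

# Reliability as the squared true-estimate correlation, per trait.
reliability_by_trait <- diag(cor(true_scores, estimated_scores))^2
```

With an error SD of 0.5 on unit-variance traits, the RMSE should come out near 0.5 and the reliability near 1 / (1 + 0.25) = 0.8, which gives a quick sanity check on the two formulas.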

Detailed descriptions of these functions, and of others not listed here, can be found in the package manual and in the help page of each function.

We also recently published a paper (Li et al., 2024) that discusses issues in the development of forced-choice scales and includes a detailed tutorial on constructing FC scales with the latest functionality of the autoFC package. Users are encouraged to refer to this paper for further details.


Brown, A., & Maydeu-Olivares, A. (2011). Item response modeling of forced-choice questionnaires. Educational and Psychological Measurement, 71(3), 460-502.

Bürkner, P. C. (2019). thurstonianIRT: Thurstonian IRT models in R. Journal of Open Source Software, 4(42), 1662.

Li, M., Zhang, B., Li, L., Sun, T., & Brown, A. (2024). Mixed-Keying or Desirability-Matching in the Construction of Forced-Choice Measures? An Empirical Investigation and Practical Recommendations. Organizational Research Methods.