A New Model to Manage Every Correlation

21 Oct 2024, by Valentina Gatti

The model was developed by Ascolani, Franzolini, Lijoi and Prünster, researchers at the Bocconi Institute for Data Science and Analytics (BIDSA).

In Bayesian statistics, the approach that updates knowledge about a phenomenon using probability measures, modeling the dependence between heterogeneous data is crucial. A model of this dependence makes it possible to integrate different data sources and improve the results of the analysis, rather than basing conclusions on a single sample. However, modeling dependence can be very complicated, especially in complex settings such as nonparametric Bayesian models. Existing models of this kind are limited to positive correlations between data from different sources, an appropriate hypothesis only when data collected from different sources tend to vary in the same direction.

Filippo Ascolani, Beatrice Franzolini, Antonio Lijoi, and Igor Prünster, researchers and professors at the Bocconi Institute for Data Science and Analytics (BIDSA), overcame this limit by introducing a new model capable of handling any type of correlation in their paper “Nonparametric priors with full-range borrowing of information”. In detail, the study outlines a model based on Completely Random Measures (CRMs) with Full-range Borrowing of Information (n-FuRBI). The model combines the flexibility of random series constructions with the analytical tractability of CRMs. This is achieved thanks to a new concept, called hyper-tie, which provides a direct and simple measure of dependence.

The key idea of the new model by Ascolani and colleagues is that the correlations between data collected from different sources are determined by the links between the latent parameters that generate them. In existing nonparametric models, the parameters corresponding to two observations collected from two different sources can either coincide or be independent. In the new model, they can be dependent without necessarily coinciding. This new latent structure yields more flexible models, which also allow negative correlation between different data sources.
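To make the role of the latent parameters concrete, here is a minimal toy sketch in Python. It is not the n-FuRBI model and is not taken from the paper: it only assumes that each observation equals a Gaussian latent parameter plus Gaussian noise, and compares three hypothetical links between the two sources' parameters. The function names (`pearson`, `simulate`) and the link labels (`"tied"`, `"independent"`, `"opposed"`) are made up for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def simulate(link, n=20000, noise=0.5, seed=0):
    """Correlation between paired observations from two sources.

    `link` is a toy assumption about how the two latent parameters relate:
      "tied"        : the parameters coincide
      "independent" : the parameters are drawn separately
      "opposed"     : the parameters are dependent but do not coincide
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        theta1 = rng.gauss(0.0, 1.0)          # latent parameter, source 1
        if link == "tied":
            theta2 = theta1                    # exact tie across sources
        elif link == "independent":
            theta2 = rng.gauss(0.0, 1.0)       # no link at all
        elif link == "opposed":
            theta2 = -theta1                   # dependent, mirrored value
        else:
            raise ValueError(f"unknown link: {link}")
        # Each observation is its latent parameter plus independent noise.
        xs.append(theta1 + rng.gauss(0.0, noise))
        ys.append(theta2 + rng.gauss(0.0, noise))
    return pearson(xs, ys)


if __name__ == "__main__":
    for link in ("tied", "independent", "opposed"):
        print(link, round(simulate(link), 3))
```

Under these assumptions the theoretical correlations are 0.8, 0 and -0.8 respectively, so the simulated values should land close to them. Only the third case, where the parameters are linked without being equal, produces a negative correlation between the sources, which is the kind of behavior the article attributes to the new latent structure.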

The researchers tested the model on both simulated and real data. In the latter case, it was used to predict stock and bond returns and to group students into clusters based on their results on certain tests. The new model outperformed other existing methods, providing more accurate predictions and more precise clustering, even in the presence of missing data.

In terms of predictions, the n-FuRBI model offers greater flexibility because it can incorporate both positive and negative relationships between different sources. This allows more precise estimates even in complex scenarios, where the variables do not behave homogeneously. Finally, n-FuRBI models allow for a variety of interesting extensions: they can be seen as effective building blocks for modeling non-trivial dependence relationships in more complex data analyses.

IGOR PRUENSTER

Bocconi University

Department of Decision Sciences

ANTONIO LIJOI

Bocconi University

Department of Decision Sciences
