Plenary Sessions

  • Professor Maria José Lombardia, Department of Mathematics (Statistics and Operations Research Area), Faculty of Informatics, University of A Coruña

    Title: Estimation of labour force indicators in counties of Galicia using a multinomial mixed model
    Abstract: The aim is the estimation of small area labour force indicators, such as totals of employed and unemployed people and unemployment rates. Small area estimators of these quantities are derived from three multinomial logit mixed models with correlated time and area random effects. Mean squared errors are used to measure the accuracy of the proposed estimators and are estimated by bootstrap methods. The proposed methodology is applied to real data from the Spanish Labour Force Survey of Galicia.
    Key-words: Bootstrap, multinomial mixed models, small area estimation, unemployment totals, unemployment rates
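
    The abstract does not spell out the model; as an illustration only (the notation and the exact random-effects structure below are the editor's assumptions, not necessarily those of the talk), a multinomial logit mixed model with area and time random effects for labour force categories can be sketched as

      % Illustrative notation: d indexes areas, t time periods,
      % k = 1 (employed), 2 (unemployed); k = 3 (inactive) is the reference category.
      \[
      p_{dtk} = \frac{\exp\{x_{dt}^{\top}\beta_k + u_{dk} + v_{dtk}\}}
                     {1 + \sum_{\ell=1}^{2}\exp\{x_{dt}^{\top}\beta_\ell + u_{d\ell} + v_{dt\ell}\}},
      \qquad
      y_{dt}\mid u,v \sim \operatorname{Multinomial}\bigl(n_{dt};\, p_{dt1}, p_{dt2}, p_{dt3}\bigr),
      \]

    with $p_{dt3}=1-p_{dt1}-p_{dt2}$, where the $u_{dk}$ are area random effects and the $v_{dtk}$ are time random effects, possibly correlated over $t$. Small area totals of employed and unemployed people are then obtained by combining the predicted probabilities with area population counts, and their mean squared errors can be approximated by a parametric bootstrap that resamples from the fitted model.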

  • Professor Maurizio Vichi, Faculty of Statistics, Sapienza University of Rome

    Title: Statistical models for clustering data
    Abstract: New statistical methodologies for clustering data are presented. They share the common feature of defining a parametric model for clustering data, which is estimated in a least-squares or maximum likelihood context and finally validated by information criteria or resampling methods. The presentation is divided into three parts: (i) single partitioning and hierarchical clustering of a set of observations, (ii) multi-partitioning of a set of observations and a set of variables and (iii) clustering longitudinal multivariate observations.
    Key-words: Cluster Analysis, Clustering longitudinal data, Multimode Clustering
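
    As a minimal sketch of the least-squares clustering framework described in the abstract (single partitioning only; the toy data, cluster range and stability criterion are the editor's assumptions, not the speaker's methods), the following Python code fits k-means partitions for several numbers of clusters and compares them by a simple resampling-stability score:

      # Minimal illustration (editor's assumptions): a least-squares partitioning
      # model (k-means) selected by a simple resampling-stability criterion.
      import numpy as np
      from sklearn.cluster import KMeans
      from sklearn.metrics import adjusted_rand_score

      rng = np.random.default_rng(0)
      # Toy data: three spherical groups in two dimensions.
      X = np.vstack([rng.normal(loc=c, scale=0.5, size=(100, 2))
                     for c in ([0, 0], [4, 0], [2, 3])])

      def stability(X, k, n_rep=20, frac=0.8):
          """Mean agreement (adjusted Rand index) between partitions fitted on random subsamples."""
          n = len(X)
          scores = []
          for r in range(n_rep):
              i1 = rng.choice(n, int(frac * n), replace=False)
              i2 = rng.choice(n, int(frac * n), replace=False)
              km1 = KMeans(n_clusters=k, n_init=10, random_state=r).fit(X[i1])
              km2 = KMeans(n_clusters=k, n_init=10, random_state=r + 1).fit(X[i2])
              # Compare the two fitted partitions on the full data set.
              scores.append(adjusted_rand_score(km1.predict(X), km2.predict(X)))
          return float(np.mean(scores))

      for k in range(2, 6):
          inertia = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
          print(f"k={k}: within-cluster SS = {inertia:7.1f}   stability = {stability(X, k):.2f}")

    The partition whose labels are most reproducible across subsamples would typically be retained, mirroring the validation-by-resampling step mentioned in the abstract.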

  • Professor Ana M. Pires, Department of Mathematics and CEMAT, Instituto Superior Técnico, Lisboa

    Title: Data analysis in the large data era
    Abstract: The tremendous advancement of new technologies has brought along a veritable flood of information. Incredible volumes of data are being produced faster and faster, creating problems of storage, transmission, manipulation, analysis and visualization. Using a simplistic classification, three types of large data can be considered: (i) large number of observations, small number of variables; (ii) small number of observations, larger number of variables; (iii) large number of observations and of variables. In this presentation real data will be used to illustrate specific problems in those scenarios. The role of hypothesis testing and statistical modeling is analyzed, and it will be shown that they run into trouble even with type (i), expected to be the simplest case one may consider. The curse of dimensionality, one of the difficulties faced when analyzing data of type (ii), is also discussed. Finally, we will see how some less standard methods, such as machine learning, nearest-neighbor or resampling methods, can be used in these extreme conditions to produce acceptable answers.
    Key-words: Curse of dimensionality, high-dimensional data, hypothesis tests, large data, machine learning methods, nearest-neighbor methods, resampling methods, statistical models.
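
    A small numerical illustration of the curse of dimensionality mentioned in the abstract (a sketch under the editor's assumptions, not the speaker's example): with few observations and many variables, the distances from a point to its nearest and farthest neighbours become nearly indistinguishable, which undermines nearest-neighbor methods in the type (ii) setting.

      # Illustrative sketch (editor's assumption): distance concentration as dimension grows.
      import numpy as np

      rng = np.random.default_rng(1)
      n = 200                              # few observations (type (ii) setting)
      for p in (2, 10, 100, 1000):         # growing number of variables
          X = rng.standard_normal((n, p))
          d = np.linalg.norm(X[1:] - X[0], axis=1)   # distances from one point to all others
          contrast = (d.max() - d.min()) / d.min()   # relative spread of neighbour distances
          print(f"p={p:5d}  relative contrast = {contrast:.3f}")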