As with FCM, one of the main problems found in the methods adapted for interval data is the difficulty of partitioning data with high overlap: given the centers $v_i$ and $v_j$ for some $i \neq j$, the distance between them may contain zero, that is, $0 \in d(v_i, v_j)$. As a result, many objects may have a distance containing zero to several centers, and the membership of these objects in those centers is $[0, 1]$, so that all of them are mapped to the same pointwise membership value. This behavior can also be influenced by the choice of prototypes for the initial partition.
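The overlap problem can be illustrated with a small sketch. It assumes, purely for illustration, that the distance between two one-dimensional intervals is the range of all pointwise distances $|a - b|$; this is one simple interval-valued distance and not necessarily the i-metric adopted in this dissertation. The names `interval_distance` and `contains_zero` are hypothetical.

```python
from typing import Tuple

Interval = Tuple[float, float]


def interval_distance(x: Interval, y: Interval) -> Interval:
    """Range of |a - b| for a in x and b in y (illustrative assumption)."""
    # Moore difference x - y = [x.lo - y.hi, x.hi - y.lo]
    lo_diff = x[0] - y[1]
    hi_diff = x[1] - y[0]
    if lo_diff <= 0.0 <= hi_diff:
        # The intervals overlap, so some pair of points coincides.
        lower = 0.0
    else:
        lower = min(abs(lo_diff), abs(hi_diff))
    upper = max(abs(lo_diff), abs(hi_diff))
    return (lower, upper)


def contains_zero(d: Interval) -> bool:
    """True when the interval distance d includes zero."""
    return d[0] <= 0.0 <= d[1]


# Two overlapping centers and one well-separated center (made-up values).
v_i = (1.0, 3.0)
v_j = (2.0, 5.0)
v_k = (7.0, 8.0)

d_overlap = interval_distance(v_i, v_j)  # lower bound is zero
d_apart = interval_distance(v_i, v_k)    # lower bound is strictly positive

print(d_overlap, contains_zero(d_overlap))
print(d_apart, contains_zero(d_apart))
```

Under this assumption, any pair of overlapping centers yields a distance whose lower bound is zero, which is the situation in which the interval membership degenerates to $[0, 1]$.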
That said, in these cases the number of executions of the algorithm is high and the quality of the partitions obtained decreases. It is therefore important to study new techniques capable of handling these overlaps, as well as the choice of the initial partition.
The choice of the order used to compare intervals is another factor that must be analyzed. At each iteration, the IbckM algorithm computes, for each object, the maximum of a set of interval memberships. Although this dissertation suggests the use of admissible orders, the choice of order is expected to affect the groups generated. Moreover, the greatest impact of the order lies in the evaluation of the results, provided that the criterion used to evaluate the clustering is interval-valued. The order can thus influence the search for the partition that optimizes a criterion and lead to different results for each order.
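A minimal sketch of this sensitivity is given below. It uses three linear orders on intervals that are commonly cited in the literature as admissible orders (two lexicographic orders and the Xu and Yager order). The membership values are invented for illustration, and the sort-key functions are hypothetical helpers, not part of IbckM.

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def key_lex1(x: Interval) -> Tuple[float, float]:
    """Lexicographic order: compare lower bounds first, then upper bounds."""
    return (x[0], x[1])


def key_lex2(x: Interval) -> Tuple[float, float]:
    """Lexicographic order: compare upper bounds first, then lower bounds."""
    return (x[1], x[0])


def key_xu_yager(x: Interval) -> Tuple[float, float]:
    """Xu and Yager order: compare sums of bounds first, then widths."""
    return (x[0] + x[1], x[1] - x[0])


# Interval memberships of one object in three clusters (made-up values).
memberships: List[Interval] = [(0.2, 0.9), (0.3, 0.5), (0.1, 0.95)]

max_lex1 = max(memberships, key=key_lex1)
max_lex2 = max(memberships, key=key_lex2)
max_xy = max(memberships, key=key_xu_yager)

print(max_lex1, max_lex2, max_xy)
```

For these values each order selects a different maximum, so the same object would be associated with a different cluster depending on the order adopted.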
Finally, Moore's classical interval arithmetic was one of the obstacles to the approach taken in this dissertation. Some operations increase imprecision. In addition, some basic operations lack desirable properties; for example, given $X \in I(\mathbb{R})$, in general $X - X \neq [0, 0]$, with equality only when $X$ is a degenerate interval. The feasibility of other interval arithmetics in the context of i-metrics for clustering methods should be studied. The result obtained in this dissertation is the fruit of collaboration among several researchers from the Departamento de Informática e Matemática Aplicada (UFRN) and may also serve as motivation for other researchers.
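The loss of the cancellation property can be checked with a short sketch of Moore addition and subtraction. The `Interval` class below is a hypothetical minimal implementation written for illustration only, and it ignores the outward rounding that a rigorous interval library would apply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other: "Interval") -> "Interval":
        # Moore addition: [a, b] + [c, d] = [a + c, b + d]
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other: "Interval") -> "Interval":
        # Moore subtraction: [a, b] - [c, d] = [a - d, b - c]
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def width(self) -> float:
        return self.hi - self.lo


x = Interval(1.0, 2.0)
diff = x - x    # not [0, 0]: both operands are treated as independent
total = x + x   # the width doubles

point = Interval(3.0, 3.0)
point_diff = point - point  # degenerate intervals do cancel

print(diff, diff.width())
print(total, total.width())
print(point_diff)
```

The width of `x - x` is twice the width of `x`, which is the growth in imprecision mentioned above; only degenerate intervals cancel exactly.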