
7.1.1 Precision x Recall curve of models

To calculate the Precision x Recall curve of the models used in this project, the code developed by Rafael Padilla [1] was used. Each model was run on 100 images whose ground truth values were known. These ground truth values were stored in .txt format, where each row contains the class and its bounding box coordinates. The bounding box coordinates and confidence predicted by each model for each image were also stored in .txt format, with the file name matching the image name. In total, 100 .txt files were created, one for each image on which predictions were made. Examples of these .txt files are shown below:

Ground Truth .txt

player 3274 849 3330 940
player 3515 749 3550 833
player 2654 822 2715 942

Predictions .txt

player 0.65330 3171 1016 3214 1086
player 0.57810 2988 800 3034 905
player 0.54516 3894 1366 3965 1487
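Each file follows the plain-text layout expected by the evaluation tool [1]: one detection per line, with the class name first, the confidence second (predictions only), and the bounding box corners last. Below is a minimal sketch of how such a prediction file could be written, assuming a hypothetical list of (class, confidence, xmin, ymin, xmax, ymax) tuples per image; the image name is illustrative only.

import os

def write_prediction_file(image_name, detections, out_dir="predictions"):
    """Write one line per detection: <class> <confidence> <xmin> <ymin> <xmax> <ymax>.
    `detections` is an assumed list of (class, confidence, xmin, ymin, xmax, ymax) tuples."""
    os.makedirs(out_dir, exist_ok=True)
    txt_path = os.path.join(out_dir, os.path.splitext(image_name)[0] + ".txt")
    with open(txt_path, "w") as f:
        for cls, conf, xmin, ymin, xmax, ymax in detections:
            f.write("{} {:.5f} {} {} {} {}\n".format(cls, conf, xmin, ymin, xmax, ymax))

# One file per image, named after the image
write_prediction_file("frame_001.jpg",
                      [("player", 0.65330, 3171, 1016, 3214, 1086)])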

Using the PASCAL VOC evaluation metric on these .txt files, Precision x Recall curves were generated to assess the performance of the models. These are depicted in the plots presented in figure 7.3 along with their model information. Since the GMM and Deep Simple Online and Realtime Tracking (Deep SORT) methods cannot work on static images, plots for these two methods are not included in the figures below.
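Under the PASCAL VOC metric, a prediction counts as a true positive when its Intersection over Union (IoU) with an unmatched ground-truth box of the same class is at least 0.5; precision and recall are then accumulated over the detections sorted by confidence. A minimal sketch of the IoU test, for boxes given as (xmin, ymin, xmax, ymax):

def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (xmin, ymin, xmax, ymax)."""
    xa = max(box_a[0], box_b[0])
    ya = max(box_a[1], box_b[1])
    xb = min(box_a[2], box_b[2])
    yb = min(box_a[3], box_b[3])
    inter = max(0, xb - xa) * max(0, yb - ya)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter)

# A detection overlapping a ground-truth box with iou(...) >= 0.5 is a true positive
print(iou((3274, 849, 3330, 940), (3270, 850, 3328, 938)))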

Viewing the results shown in figure 7.3, it can be seen that the custom trained model reached a much higher average precision than the rest of the models.

7.2 Frames Per Second (FPS)

As this project aims to translate the predicted detections onto smaller screens in real time, the models need to yield a high FPS throughput for good performance. Since all the models were benchmarked on a CPU, a very low throughput was observed for the deep learning models. The background subtraction methods, on the other hand, produced a comparatively higher throughput. This is shown in table 7.1. The figures in the table were collected by loading a 2 minute 30 second clip of ∼442 MB in size.

TABLE 7.1. Frames Per Second (FPS) throughput of the models. The models were run on the window between the 30 second and 60 second marks of the clip to calculate the FPS.
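The throughput figures in table 7.1 were gathered by timing each model over that window of the clip. A minimal sketch of such a measurement, assuming a hypothetical detect(frame) callable that wraps the model under test:

import time
import cv2

def measure_fps(video_path, detect, start_s=30, end_s=60):
    """Time `detect` on every frame between start_s and end_s and return FPS."""
    cap = cv2.VideoCapture(video_path)
    cap.set(cv2.CAP_PROP_POS_MSEC, start_s * 1000)
    frames, t0 = 0, time.time()
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret or cap.get(cv2.CAP_PROP_POS_MSEC) > end_s * 1000:
            break
        detect(frame)                     # model under test
        frames += 1
    cap.release()
    elapsed = time.time() - t0
    return frames / elapsed if elapsed > 0 else 0.0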

7.3 Visual Outputs

For the purpose of visualization, outputs were generated using the different models. Since GMM background subtraction depends on a history of frames to calculate the foreground, a video stream was used to generate its output. For the rest of the models, a single image was used to generate results. Each output contains the frame the model is being applied to at the top, and a small screen representation at the bottom, with players shown as colored circles (white for the team with white jerseys and black for the team with black jerseys).

(a) YOLO on pre-trained YOLO weights (b) Faster R-CNN on pre-trained COCO weights

(c) SSD on pre-trained COCO weights (d) Background subtraction, Absolute difference

(e) Custom trained model on Faster R-CNN

FIGURE 7.3. Precision x Recall curves with the average precision of the models used


FIGURE 7.4. Result using the GMM model

The result shows that the goalkeeper of the black team was not detected. In the background, where players are highly occluded, the model was also unable to detect the players with high accuracy. The model often makes mistakes when recognizing the team of a player. This happens because occluded players produce bounding boxes containing more white pixels than black, or vice versa. Apart from these limitations, the model performed well at detecting the players on the pitch and showing them on the smaller screen.

FIGURE 7.5. Result using Image subtraction model

As can be seen, several false detections were generated by the model. This was mainly because a lot of noise was produced after generating the foreground. Dedicated checks were performed to eliminate false detections; however, not all of them could be removed completely.


FIGURE 7.6. Result using the Faster R-CNN model with pre-trained COCO weights

The Faster R-CNN model trained on the COCO data set was unable to detect the players with high accuracy. It can be seen that most of the players on the pitch are not detected. Since the COCO data set consists of images of various objects without the distortion introduced by this camera angle, the pre-trained weights did not perform very well.

FIGURE 7.7. Result using YOLO model with pre-trained COCO weights

The YOLO model trained on the COCO data set was able to yield better accuracy than Faster R-CNN; YOLO's performance is in general better than that of the rest of the models due to its architecture. It can be seen that both goalkeepers are marked as belonging to the same team. This is because the script created to measure the strength of each color considers the cropped image inside the bounding box, and this crop also contains background, including the white lines on the football pitch, which add to the strength of white without being part of the player's jersey.


FIGURE 7.8. Result using the Faster R-CNN model with custom trained weights

The custom trained Faster R-CNN model was able to yield higher performance than the rest of the deep learning models. It can be seen that the model did a good job of detecting the players in white, but could not properly detect the players from the black team. Since the model has an accuracy of 81%, it was anticipated that it would not be able to detect all the players in every frame.

Conclusion

The aim of this project was to retrieve the positions of players on a football field from a video stream and illustrate them on a smaller screen. Several models were created and tested against each other to localize the positions of the players. Each model was evaluated based on its accuracy and its throughput rate. The positions were then mapped onto the smaller screen as an aerial view, with each player represented on a 2D image as a circle, for every frame of the video. To make the system real time, a cloud integration along with a native application was developed to show the positions on a mobile phone.

Neither the background subtraction methods nor the deep learning models were able to detect the players with high accuracy. The GMM background subtraction method worked best when the players were not occluded, as it was able to detect the players at the back of the scene (away from the camera). The custom trained deep learning model performed much better than background subtraction when players were occluded. The deep learning models were, however, computationally very expensive, which made their run time very slow compared to the background subtraction methods. The processing time per frame of each model is shown in table 7.1; these figures were calculated on a CPU and can be improved with the introduction of a GPU into the system.

It was hard to quantify how accurately the players were represented on the smaller screen, due to the distortion in the camera angle of the video stream. It cannot be stated with certainty whether the players shown on the smaller 2D image (425 × 640) were represented accurately with respect to their positions in the video frame. The players were represented by circles with a radius of 3 pixels, placed based on the position of their feet.
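As a rough illustration of that mapping, the sketch below rescales a detected foot position from the video frame into a 640 × 425 top-down image and draws the radius-3 circle. The frame size, the image orientation, and the plain linear rescale are assumptions made for the example; the actual system has to contend with the camera distortion discussed above.

import cv2
import numpy as np

FRAME_W, FRAME_H = 4450, 2000      # assumed source frame size (illustrative only)
PITCH_W, PITCH_H = 640, 425        # assumed orientation of the 425 x 640 target image

def draw_players(pitch_img, feet_positions, colours):
    """Draw each player as a radius-3 circle at the linearly rescaled foot position."""
    for (x, y), colour in zip(feet_positions, colours):
        px = int(x * PITCH_W / FRAME_W)
        py = int(y * PITCH_H / FRAME_H)
        cv2.circle(pitch_img, (px, py), 3, colour, -1)
    return pitch_img

pitch = np.zeros((PITCH_H, PITCH_W, 3), dtype=np.uint8)
pitch[:] = (0, 128, 0)             # green background for the aerial view
draw_players(pitch, [(3274, 940), (2654, 942)],
             [(255, 255, 255), (0, 0, 0)])   # white team, black team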



8.1 Future outlook

There is room for further improvement in this project, and the positional data extracted can be used to create predictive machine learning models. Such models could play a vital role in improving the capabilities of players and in creating new strategies for teams.

8.1.1 Model Improvement

The data set created to train the deep learning model for this project will be available and can be used to train new models. At the time of writing, new models such as YOLOv4 [33] and YOLOv5 have been introduced. These models are claimed to be better and faster than other CNN-based detectors. If they were applied to the video stream, it is anticipated that better accuracy could be achieved.

Object tracking algorithms such as the Kalman filter or centroid tracking can be used to track the movement of an individual player. These tracking techniques do not work well when the objects under consideration have similar colors, which is the case here given the players' jerseys. However, newer techniques can be developed for object tracking that rely not on the color of an object but on its position and momentum. By introducing a tracking algorithm, an ID can be assigned to each detection, as sketched below. This would help in calculating statistics for individual players without the use of sensor technology.
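A minimal sketch of the centroid-tracking idea, assuming per-frame lists of detected foot or box centroids; it illustrates ID assignment only and is not part of the implemented system.

import numpy as np
from scipy.spatial.distance import cdist

class CentroidTracker:
    """Assign a persistent ID to each detection by matching it to the
    nearest previously seen centroid (illustrative sketch only)."""

    def __init__(self, max_distance=50):
        self.next_id = 0
        self.objects = {}            # id -> (x, y) last known centroid
        self.max_distance = max_distance

    def update(self, centroids):
        if not centroids:
            return self.objects
        if not self.objects:
            for c in centroids:
                self.objects[self.next_id] = c
                self.next_id += 1
            return self.objects

        ids = list(self.objects.keys())
        prev = np.array([self.objects[i] for i in ids])
        dists = cdist(prev, np.array(centroids))

        used_rows, used_cols = set(), set()
        # Greedily match the closest (track, detection) pairs first
        for row, col in zip(*np.unravel_index(np.argsort(dists, axis=None), dists.shape)):
            if row in used_rows or col in used_cols:
                continue
            if dists[row, col] > self.max_distance:
                break
            self.objects[ids[row]] = centroids[col]
            used_rows.add(row)
            used_cols.add(col)

        # Unmatched detections start new tracks with fresh IDs
        for col, c in enumerate(centroids):
            if col not in used_cols:
                self.objects[self.next_id] = c
                self.next_id += 1
        return self.objects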

8.1.2 E2E models

End-to-end deep learning models like MuZero [34] can be applied to the positional data collected. These models could be used to devise new strategies for the game of football. Since such models can simulate different scenarios during training and may come up with better strategies than a human, the whole paradigm of sports analysis could be shifted and an Artificial Intelligence (AI) coach could be developed. Trained to a sufficient extent, this AI coach could provide real-time analysis and strategies.

8.1.3 Smart Watches

Since the positional data is small in terms of storage, it can also be viewed on much smaller screens such as smart watches. As some smart watches have a rectangular dial (e.g. the Apple Watch), the positional data can be displayed on them using small dots. This would allow more mobility, providing a better and faster way to analyze the data.

Bibliography

[1] Rafael Padilla, Sergio Lima Netto, and Eduardo A. B. da Silva. Survey on performance metrics for object-detection algorithms. 2020.

[2] Simen Sægrov, Alexander Eichhorn, Jørgen Emerslund, Håkon Kvale Stensland, Carsten Griwodz, Dag Johansen, and Pål Halvorsen. Bagadus: An integrated system for soccer analysis. In 2012 Sixth International Conference on Distributed Smart Cameras (ICDSC), pages 1–2. IEEE.

[3] Håkon Kvale Stensland, Vamsidhar Reddy Gaddam, Marius Tennøe, Espen Helgedagsrud, Mikkel Næss, Henrik Kjus Alstad, Asgeir Mortensen, Ragnar Langseth, Sigurd Ljødal, Oystein Landsverk, Carsten Griwodz, Pål Halvorsen, Magnus Stenhaug, and Dag Johansen. Bagadus: An integrated real-time system for soccer analytics. Transactions on Multimedia Computing, Communications and Applications, 10.

[4] S.A. Pettersen, D. Johansen, H. Johansen, V. Berg-Johansen, V.R. Gaddam, A. Mortensen, R. Langseth, C. Griwodz, H.K. Stensland, and P. Halvorsen. Soccer video and player position dataset. In Proceedings of the International Conference on Multimedia Systems (MMSys), pages 18–23, Singapore.

[5] FFmpeg Developers. FFmpeg tool [software], 2016.

[6] Svein Arne Pettersen, Dag Johansen, Håvard Johansen, Vegard Berg-Johansen, Vamsidhar Reddy Gaddam, Asgeir Mortensen, Ragnar Langseth, Carsten Griwodz, Håkon Kvale Stensland, and Pål Halvorsen. Soccer video and player position dataset. In Proceedings of the 5th ACM Multimedia Systems Conference, MMSys '14, pages 18–23, New York, NY, USA, 2014. Association for Computing Machinery. ISBN 9781450327053. doi: 10.1145/2557642.2563677. URL https://doi.org/10.1145/2557642.2563677.

[7] Tzutalin. LabelImg. Git code. URL https://github.com/tzutalin/labelImg.

[8] Ahmed Elgammal, David Harwood, and Larry Davis. Non-parametric model for background subtraction. In David Vernon, editor, Computer Vision — ECCV 2000, pages 751–767, Berlin, Heidelberg, 2000. Springer Berlin Heidelberg. ISBN 978-3-540-45053-5.

[9] Helly M Desai and Vaibhav Gandhi. A survey: Background subtraction techniques. International Journal of Scientific & Engineering Research, 5(12):1365, 2014.

[10] Ravi Srisha and Am Khan. Morphological operations for image processing: Understanding and its applications. 12 2013.

[11] George Green. An essay on the application of mathematical analysis to the theories of electricity and magnetism.

[12] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.

[13] Goran Nakerst, John Brennan, and Masudul Haque. Gradient descent with momentum — to accelerate or to super-accelerate? arXiv e-prints, art. arXiv:2001.06472, January 2020.

[14] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition.

[15] J. Heikkila and O. Silven. A real-time system for monitoring of cyclists and pedestrians. In Proceedings Second IEEE Workshop on Visual Surveillance (VS'99) (Cat. No.98-89223), pages 74–81, 1999.

[16] Gideon Schwarz. Estimating the dimension of a model. The Annals of Statistics, 6(2):461–464, 1978.

[17] S. Suzuki and K. Abe. Topological structural analysis of digitized binary images by border following. CVGIP, 30(1):32–46.

[18] Martin Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. Tensorflow: A system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pages 265–283, 2016. URL https://www.usenix.org/system/files/conference/osdi16/osdi16-abadi.pdf.

[19] Jan Hosang, Rodrigo Benenson, and Bernt Schiele. Learning non-maximum suppression. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 4507–4515, 2017.

[20] Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander C Berg. Ssd: Single shot multibox detector. In European conference on computer vision, pages 21–37. Springer, 2016.

[21] Christian Szegedy, Scott Reed, Dumitru Erhan, Dragomir Anguelov, and Sergey Ioffe. Scalable, High-Quality Object Detection. arXiv e-prints, art. arXiv:1412.1441, December 2014.

[22] Joseph Redmon and Ali Farhadi. Yolov3: An incremental improvement. arXiv preprint arXiv:1804.02767, 2018.

[23] G. Bradski. The OpenCV Library. Dr. Dobb’s Journal of Software Tools, 2000.

[24] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A Large-Scale Hierarchical Image Database. In CVPR09, 2009.

[25] Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. Microsoft coco: Common objects in context. In European conference on computer vision, pages 740–755. Springer, 2014.

[26] Richard Hartley and Andrew Zisserman. Multiple view geometry in computer vision. Cambridge University Press.

[27] Laurence Moroney. The firebase realtime database. In The Definitive Guide to Firebase, pages 51–71. Springer, 2017.

[28] Bonnie Eisenman. Learning React Native: Building native mobile apps with JavaScript. O'Reilly Media, Inc., 2015.

[29] Hammad Ali. Footy-redar app code, 2020. URL https://github.com/hmdall/mob-viewer.

[30] M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The pascal visual object classes challenge: A retrospective. International Journal of Computer Vision, 111(1):98–136, January 2015.

[31] Ivan Dokmanic, Reza Parhizkar, Juri Ranieri, and Martin Vetterli. Euclidean Distance Matrices: Essential theory, algorithms, and applications. IEEE Signal Processing Magazine, 32(6):12–30, November 2015.

[32] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.

[33] Alexey Bochkovskiy, Chien-Yao Wang, and Hong-Yuan Mark Liao. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv e-prints, art. arXiv:2004.10934, April 2020.

[34] Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez, Edward Lockhart, Demis Hassabis, Thore Graepel, Timothy Lillicrap, and David Silver. Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model. arXiv e-prints, art. arXiv:1911.08265, November 2019.

Appendix A

A1 Background Subtraction Implementation code

A1.1 Extracting Background from Video

import cv2


def extract_background(videoFile):
    """Receives a video file name (with extension) and saves the
    background image in the specified folder."""
    video = cv2.VideoCapture(videoFile)
    c = 0
    average_ = None
    while video.isOpened():
        ret, img = video.read()
        if not ret:
            break
        if average_ is None:
            average_ = img.astype(float)
        # Running average of all frames read so far
        fc = float(c)
        average_ = (fc * average_ + img) / (fc + 1)
        c += 1
    cv2.imwrite('path_of_file', average_)

A1.2 Binary Masking


bg = cv2.imread('img/background.jpg')
bg_img = cv2.cvtColor(bg, cv2.COLOR_BGR2GRAY)


def track_player(video_file):
    cap = cv2.VideoCapture(video_file)
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break

        # Absolute difference between the static background and the current frame
        gray_img = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        bg_delta = cv2.absdiff(bg_img, gray_img)
        threshold = cv2.threshold(bg_delta, 30, 255, cv2.THRESH_BINARY)[1]

        # Morphological operations to clean up the binary mask
        threshold = cv2.erode(threshold, (13, 13), iterations=1)
        threshold = cv2.dilate(threshold, (10, 10), iterations=1)
        threshold = cv2.erode(threshold, (11, 11), iterations=1)
        threshold = cv2.dilate(threshold, (8, 8), iterations=1)

        contours, _ = cv2.findContours(threshold.copy(),
                                       cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)

A1.3 Finding n-components for GMM

import cv2
import numpy as np
import matplotlib.pyplot as plt
from sklearn.mixture import GaussianMixture as GMM

frame = cv2.imread('img/0.png')
img = frame.reshape((-1, 3))

# Fit a GMM for each candidate number of components and compare their BIC scores
n_components = np.arange(1, 10)
gmm_model = [GMM(n, covariance_type='tied').fit(img) for n in n_components]

plt.plot(n_components, [m.bic(img) for m in gmm_model], label="BIC")
plt.xlabel('n_components')

A1.4 Gaussian Mixture Model

import cv2

back_sub = cv2.bgsegm.createBackgroundSubtractorMOG(history=5, nmixtures=8)

vid_cap = cv2.VideoCapture(video_file_path)

while vid_cap.isOpened():
    ret, img = vid_cap.read()
    if not ret:
        break

    fgMask = back_sub.apply(img)

    # Morphological operations to remove noise from the foreground mask
    fgMask = cv2.erode(fgMask, (13, 13), iterations=1)
    fgMask = cv2.dilate(fgMask, (2, 2), iterations=1)
    fgMask = cv2.erode(fgMask, (13, 13), iterations=1)
    fgMask = cv2.dilate(fgMask, (2, 2), iterations=2)

    contours, _ = cv2.findContours(fgMask.copy(), cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)

A1.5 Finding Contours

# Finding contours on each frame inside the while loop
contours, _ = cv2.findContours(fgMask.copy(), cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)

player_pos = []

# Iterating over contours
for cn in contours:
    (x, y, w, h) = cv2.boundingRect(cn)

    # Foot point: horizontal centre of the box, at its bottom edge
    feet_coord = [float(x + int(w / 2.0)), float(y + h)]
    feets = Point(feet_coord[0], feet_coord[1])

    rect_area = cv2.contourArea(cn)

    # Checking if the detection is inside the polygon defined for the pitch
    if not field_polygon_points.contains(feets):
        continue

    # Performing several checks for player detection
    if not h > w * 1.5:
        continue
    if rect_area < 100:
        continue
    if w > h:
        continue
    if y > 1000 and 600 < x < 2900 and rect_area < 1000:
        continue
    if 800 < y < 1000 and 600 < x < 2900 and rect_area < 500:
        continue
    if 600 < y < 800 and 600 < x < 2900 and rect_area < 300:
        continue
    if y > 1000 and x > 2900 and rect_area < 200:
        continue

    player_pos.append([x, y, x + w, y + h])

A2 Deep Learning Implementations code

A2.1 Faster R-CNN implementation

import pathlib
from collections import defaultdict

from object_detection.utils import ops as utils_ops
from object_detection.utils import label_map_util

import tensorflow as tf
import numpy as np
import cv2


def load_model(model_name):
    model_dir = pathlib.Path(model_name) / "saved_model"
    model = tf.saved_model.load(str(model_dir), None)
    model = model.signatures['serving_default']
    return model
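The helper above returns the serving signature of a TensorFlow Object Detection API saved model. The run_inference_for_single_image call used further below is not reproduced in the original listing; a minimal sketch of what it could look like, assuming the standard TF2 Object Detection API output keys ('detection_boxes', 'detection_scores', 'detection_classes', 'num_detections'):

def run_inference_for_single_image(model, image):
    """Run the serving signature on a single frame and return numpy outputs.
    (Sketch only; assumes `model` is the signature returned by load_model.)"""
    input_tensor = tf.convert_to_tensor(image)[tf.newaxis, ...]   # add batch dimension
    output = model(input_tensor)
    num = int(output.pop('num_detections')[0])
    # Keep only the valid detections and convert the remaining tensors to numpy
    return {key: value[0, :num].numpy() for key, value in output.items()}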


confThreshold = 0.3


classIds = []

A3 Deep SORT Implementation

out_duct = run_inference_for_single_image(detection_model, image)

# Seed new correlation trackers from the detections every 30 frames
if len(trackers) < 23 and count % 30 == 0:
    for i in out_duct['detection_boxes']:
        ymin = int(i[0] * h)
        xmin = int(i[1] * w)
        ymax = int(i[2] * h)
        xmax = int(i[3] * w)

        feet_coord = [float((xmin + xmax) / 2), float(ymax)]
        feets = Point(feet_coord[0], feet_coord[1])

        if not field_polygon_points.contains(feets):
            continue

        t = dlib.correlation_tracker()
        rect = dlib.rectangle(xmin, ymin, xmax, ymax)
        t.start_track(rgb, rect)

        trackers.append(t)
        fps.update()
else:
    for t in trackers:
        # update the tracker and grab the position of the tracked
        # object
        t.update(rgb)
        pos = t.get_position()


cv2.setMouseCallback('image', mouse_points)
if cv2.waitKey(0) & 0xFF == 27:
    break

print('final points collected {}'.format(posList))
np.savetxt(points_file, posList)

A3.2 Point Checker

import numpy as np
from shapely.geometry import Point
from shapely.geometry.polygon import Polygon

points = np.loadtxt(points_file)
points = points.astype(int)

field_polygon_points = Polygon(points)

for cn in contours:
    feet_coord = [float(x + int(w / 2.0)), float(y + h)]
    feets = Point(feet_coord[0], feet_coord[1])

    # Discard detections whose foot point falls outside the pitch polygon
    if not field_polygon_points.contains(feets):
        continue

A3.3 HSV Color picker

import cv2
import numpy as np

cap = cv2.VideoCapture(vid_filepath)


def nothing(x):
    pass


# Creating a window for later use
cv2.namedWindow('result')

# Starting with 100's to prevent error while masking
h, s, v = 100, 100, 100

# Creating trackbars
cv2.createTrackbar('h', 'result', 0, 179, nothing)
cv2.createTrackbar('s', 'result', 0, 255, nothing)
cv2.createTrackbar('v', 'result', 0, 255, nothing)

while (1):
    _, frame = cap.read()

    # converting to HSV
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

    # get info from trackbars and apply to result
    h = cv2.getTrackbarPos('h', 'result')
    s = cv2.getTrackbarPos('s', 'result')
    v = cv2.getTrackbarPos('v', 'result')

    # Normal masking algorithm
    lower_blue = np.array([h, s, v])
    upper_blue = np.array([180, 255, 255])

    mask = cv2.inRange(hsv, lower_blue, upper_blue)
    result = cv2.bitwise_and(frame, frame, mask=mask)

    cv2.imshow('result', result)

    k = cv2.waitKey(5) & 0xFF
    if k == 27:
        break

cap.release()
cv2.destroyAllWindows()

A3.4 Color Pixel Calculator

import cv2
import numpy as np


class DetectionWithColor:
    def __init__(self):
        # HSV ranges for the white, black, and yellow kits
        self.lower_white = np.array([60, 0, 204])
        self.upper_white = np.array([179, 38, 255])

        self.lower_black = np.array([0, 0, 0])
        self.upper_black = np.array([360, 100, 50])

        self.lower_yellow = np.array([18, 182, 130])
        self.upper_yellow = np.array([67, 255, 255])

    def detectPixelCount(self, player_img, x, y):
        player_hsv = cv2.cvtColor(player_img, cv2.COLOR_BGR2HSV)

        # Count yellow pixels first
        mask3 = cv2.inRange(player_hsv, self.lower_yellow, self.upper_yellow)
        res3 = cv2.bitwise_and(player_img, player_img, mask=mask3)
        res3 = cv2.cvtColor(res3, cv2.COLOR_HSV2BGR)
        res3 = cv2.cvtColor(res3, cv2.COLOR_BGR2GRAY)
        yellowCount = cv2.countNonZero(res3)

        if yellowCount > 50:
            return 'r'

        # Count white pixels
        mask1 = cv2.inRange(player_hsv, self.lower_white, self.upper_white)
        res1 = cv2.bitwise_and(player_img, player_img, mask=mask1)
        res1 = cv2.cvtColor(res1, cv2.COLOR_HSV2BGR)
        res1 = cv2.cvtColor(res1, cv2.COLOR_BGR2GRAY)
        whiteCount = cv2.countNonZero(res1)

        # Count black pixels
        mask2 = cv2.inRange(player_hsv, self.lower_black, self.upper_black)
        res2 = cv2.bitwise_and(player_img, player_img, mask=mask2)
        res2 = cv2.cvtColor(res2, cv2.COLOR_HSV2BGR)
        res2 = cv2.cvtColor(res2, cv2.COLOR_BGR2GRAY)
        blackCount = cv2.countNonZero(res2)

        if whiteCount > blackCount:
            return whiteCount, "w"


from firebase import firebase   # python-firebase package (import not shown in the original listing)


class FirebaseUploader:          # enclosing class name not shown in the original listing
    def __init__(self):
        self.firebase = firebase.FirebaseApplication(
            'https://footy-redar.firebaseio.com', None)

    # data is the array of scaled detections
    def putData(self, data):
        new_dict = {}
        for i in data:
            # Group detection coordinates by team label ("b" or "w")
            if i[1] not in new_dict:
                new_dict[i[1]] = []
            new_dict[i[1]].append(i[0])
        print('dictionary: ', new_dict)

        if "b" in new_dict and "w" in new_dict:
            result = self.firebase.put('/points/',
                                       '/-M7-fgFoB4YiVX4LVQQ8/', new_dict)