Saturday, 27 July 2013

GitHub statistics: programming languages

That's it: we have our first mission for the French government. The brief sounds simple: identify geeks on social networks so that they can be spotted and hired. We are still searching, but we already have some interesting statistics to share with our followers: the main programming languages on GitHub. Which language do programmers use most on this social network? The answer is not straightforward, as the figures below show:



Details:

Rank Programming Language Percentage Repositories
1 JavaScript 24.71% 506131
2 Ruby 17.96% 367748
3 Python 10.33% 211614
4 PHP 8.96% 183467
5 Java 8.39% 171832
6 C 4.86% 99579
7 Objective-C 4.63% 94747
8 C++ 3.22% 66029
9 Shell 2.80% 57323
10 Perl 2.04% 41823
11 C# 1.95% 39874
12 VimL 1.59% 32654
13 CoffeeScript 0.98% 20008
14 Scala 0.89% 18165
15 Clojure 0.78% 15898
16 Haskell 0.67% 13786
17 Emacs Lisp 0.64% 13052
18 Go 0.63% 12916
19 Erlang 0.53% 10822
20 ActionScript 0.44% 9026
21 Lua 0.38% 7694
22 Puppet 0.30% 6151
23 Groovy 0.26% 5374
24 Common Lisp 0.22% 4425
25 R 0.21% 4233
26 Arduino 0.12% 2377
27 Scheme 0.11% 2179
28 OCaml 0.10% 2101
29 Matlab 0.09% 1856
30 Assembly 0.09% 1756
31 D 0.08% 1610
32 ColdFusion 0.07% 1416
33 Haxe 0.07% 1341
34 Visual Basic 0.06% 1196
35 Dart 0.06% 1140
36 Racket 0.05% 1034
37 Rust 0.05% 1016
38 F# 0.05% 971
39 PowerShell 0.05% 932
40 Delphi 0.04% 835
41 Prolog 0.03% 680
42 TypeScript 0.03% 679
43 Objective-J 0.03% 615
44 Processing 0.03% 585
45 Julia 0.03% 565
46 Verilog 0.03% 525
47 FORTRAN 0.02% 504
48 Elixir 0.02% 489
49 Tcl 0.02% 448
50 ASP 0.02% 435
51 XML 0.02% 434
52 Vala 0.02% 391
53 VHDL 0.02% 347
54 AutoHotkey 0.02% 343
55 Smalltalk 0.02% 319
56 Apex 0.02% 314
57 Logos 0.01% 274
58 Pure Data 0.01% 274
59 XSLT 0.01% 235
60 AppleScript 0.01% 219
61 SuperCollider 0.01% 217
62 Standard ML 0.01% 216
63 XQuery 0.01% 209
64 ooc 0.01% 198
65 Ada 0.01% 189
66 Io 0.01% 183
67 Coq 0.01% 149
68 OpenEdge ABL 0.01% 144
69 Lasso 0.01% 141
70 Eiffel 0.01% 140
71 DOT 0.01% 132
72 Arc 0.01% 130
73 Gosu 0.00% 101
74 Kotlin 0.00% 96
75 Nemerle 0.00% 95
76 Factor 0.00% 94
77 LiveScript 0.00% 88
78 Opa 0.00% 62
79 Scilab 0.00% 61
80 Nu 0.00% 60
81 Boo 0.00% 55
82 Mirah 0.00% 53
83 DCPU-16 ASM 0.00% 53
84 Dylan 0.00% 51
85 Nimrod 0.00% 46
86 Parrot 0.00% 45
87 Augeas 0.00% 37
88 Rebol 0.00% 35
89 Awk 0.00% 33
90 Turing 0.00% 23
91 Ceylon 0.00% 22
92 Fancy 0.00% 20
93 Monkey 0.00% 19
94 Bro 0.00% 18
95 PogoScript 0.00% 16
96 Xtend 0.00% 14
97 M 0.00% 12
98 Self 0.00% 10
99 Ragel in Ruby Host 0.00% 8
100 Ioke 0.00% 8
101 MoonScript 0.00% 7
102 Max 0.00% 6
103 eC 0.00% 5
104 Logtalk 0.00% 5
105 CLIPS 0.00% 5
106 Forth 0.00% 4
107 ABAP 0.00% 3
108 Rouge 0.00% 3
109 Fantom 0.00% 3
110 Ecl 0.00% 2
111 Max/MSP 0.00% 2
112 XProc 0.00% 1
113 wisp 0.00% 1
114 XC 0.00% 1
Total 2048137


We crawled 2048137 code repositories on GitHub and identified the programming language used in each one. Each result is stored in a Neo4j database, and the queries are written in Cypher.
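As an illustration of how the ranking above can be derived from the crawl results, here is a minimal Python sketch. It is not the code we ran (our counts come from Cypher queries against Neo4j); the function name `language_shares` and its parameters are hypothetical, and it simply assumes one language label per repository.

```python
from collections import Counter


def language_shares(repo_languages, total=None):
    """Rank languages by repository count and compute each one's share.

    repo_languages: either an iterable with one language label per
    repository, or a mapping of language -> repository count.
    total: denominator for the percentages; defaults to the sum of counts.
    Returns a list of (rank, language, percentage, count) tuples.
    """
    counts = Counter(repo_languages)
    if total is None:
        total = sum(counts.values())
    rows = []
    # most_common() yields (language, count) pairs, largest count first
    for rank, (language, count) in enumerate(counts.most_common(), start=1):
        rows.append((rank, language, round(100 * count / total, 2), count))
    return rows


# Counts for the top three languages, taken from the table above
top_three = {"JavaScript": 506131, "Ruby": 367748, "Python": 211614}
for row in language_shares(top_three, total=2048137):
    print(row)
```

Passing the overall total explicitly keeps the percentages relative to all crawled repositories, even when only a few rows are supplied.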

Beyond being a study of programming languages, this mission led us to discover languages we had never heard of.

Thanks to WolverineX02 for their help on how to use Java with Python.
