+TITLE: hcount
Counting of Haskell names and artifacts usage.
hcount --help
count Haskell artifacts Usage: hcount COMMAND [--all | --operators | --lower | --upper | --local] [-s|--stem ARG] [-d|--directory ARG] [-r|--recursize] [-e|--exclude ARG] [-n|--topn ARG] hcount Available options: --all all names --operators operators --lower lower-case functions --upper upper-case constructors --local local variable names -s,--stem ARG hie files directory -d,--directory ARG base directory -r,--recursize search recursively -e,--exclude ARG libraries to exclude -n,--topn ARG report top N results -h,--help Show this help text Available commands: report report counts. build build projects rebuild rebuild projects. addcpl Add a CPL file.
The main command is `hcount report` which reports the top names across the project (or projects). For example, this reports the top 5 lower-case functions in this project:
hcount report -n 5 --lower
Number of repos: 1 Number of files: 1 CExt view 13 CExt fmap 12 CExt help 10 CExt long 10 CExt takeRest 9
And this reports the top 5 names across all projects in the parent directory:
hcount report -n 5 -r -d ".." --lower
Number of repos: 19 Number of files: 112 CExt view 446 CExt pure 388 CExt set 385 CExt fromIntegral 277 CExt fmap 267
As well as (simplified) names, hcount
also provides a (rough) category:
-- | A simplified categorization of name types
data NameCats
= -- | Operators ("$_in$$c" prefix)
COps
| -- | Contructors ("$_in$$d" prefix)
CCon
| -- | Variables ("$_in$" prefix)
CVars
| COther
| CError
| -- | external project name
CExt
The other commands should be used cautiously, and will cause modification to Haskell projects.
hcount addcpl
will add the necessary ghc-options via adding a cabal.project.local
file to the project(s). It will not overwrite existing files.
The options can also be manually added somewhere:
ghc-options: -fwrite-ide-info -hiedir=.hie
hcount build
will build the project(s) and thus refresh hie files, prior to reporting.
hcount rebuild
will rebuild the project(s) (cabal clean && cabal build all
), and thus refresh hie files, prior to reporting.
This section contains code snippets used in the development of the app.
Results for the snippet below and the app should be the same.
report defaultOptions
Number of repos: 1 Number of files: 1 CCon ~ 98 CCon Semigroup 72 CCon Functor 70 CCon Category 60 CCon Monad 60 CCon Eq 51 CCon Show 38 CCon Applicative 36 CExt <> 36 CCon Ord 34
hcount report
Number of repos: 1 Number of files: 1 CCon ~ 98 CCon Semigroup 72 CCon Functor 70 CCon Category 60 CCon Monad 60 CCon Eq 51 CCon Show 38 CCon Applicative 36 CExt <> 36 CCon Ord 34
These snippets were used to develop and construct the pipeline from hie files to names.
:set -XOverloadedLabels
:set -Wno-name-shadowing
fss <- getFiles defaultOptions
length fss
length $ mconcat fss
nodes = mconcat fss & fmap (hie_asts >>> getAsts >>> Map.toList) & mconcat & fmap snd
length nodes
flatten h= (Map.elems $ getSourcedNodeInfo $ sourcedNodeInfo h) <> mconcat (fmap astFlatten (nodeChildren h))
idents = nodes & fmap flatten & mconcat
length idents
ns = idents & getNames
length ns
xs = ns & getSimplifiedNames
length xs
1 1 1 2234 2439 2439
There are a lot of internal Constructor names coming from GHC and these dominate the top 10 lists.
:set -XOverloadedStrings
xs' = ns & fmap toNameX
lxs = view #name <$> filter (\x -> module' x == "") xs'
exs = filter (\x -> module' x /= "") xs'
allNames = (second FP.utf8ToStr <$> rp deconstructLocalName <$> lxs) <> ((CExt,) . view #name <$> exs)
reportTopX 10 (const True) id (view #name <$> xs')
$_in$$d~ 98 $_in$$dSemigroup 78 $_in$$dFunctor 71 $_in$$dCategory 61 $_in$$dMonad 61 $_in$$dEq 51 $_in$$dOrd 45 $_in$$dApplicative 44 <> 39 $_in$$dShow 38
reportTopX 10 (const True) id (view #module' <$> xs')
1408 Main 290 GHC.Base 151 GHC.Classes 63 Options.Applicative.Builder 63 GHC.Types 60 Control.Category 32 Optics.Internal.Generic 27 GHC.Show 23 Data.Functor 22
reportTopX 10 (const True) id (view #package <$> xs')
1408 base 348 main 290 ghc-prim 123 ptcs-cr-0.4.1.1-98cffce2 79 ptprs-pplctv-0.18.1.0-4b730158 78 fltprs-0.5.0.1-877b8a9e 37 ghc-9.8.1-a76c 31 nmhsk-0.11.1.0-faaa53a7 11 directory-1.3.8.1-46c1 9
reportTopX 10 (const True) id (rp' package' <$> (filter (/= "") (view #package <$> xs')))
base 348 main 290 ghc-prim 123 ptcs-cr 79 ptprs-pplctv 78 fltprs 37 ghc 31 nmhsk 11 directory 9 filepath 8
reportTopX 10 (const True) id (view #name <$> exs)
<> 39 $ 24 . 24 <$> 22 Options 22 IO 21 String 21 & 16 FilePath 14 Int 14
reportTopX 10 (not . FP.isLatinLetter . head) id (view #name <$> exs)
<interactive>:94:41: warning: [GHC-63394] [-Wx-partial] In the use of ‘head’ (imported from NumHask.Prelude, but defined in GHC.List): "This is a partial function, it throws an error on empty lists. Use pattern matching or Data.List.uncons instead. Consider refactoring to use Data.List.NonEmpty." <> 39 $ 24 . 24 <$> 22 & 16 $fIsLabelnameOptic 7 $fLabelOpticnamekstab 7 (%,,,,,,%) 7 <*> 7 $fEqChar 6
:set -Wno-x-partial
reportTopX 10 (\x -> ((CVars==) . fst) x && (Char.isLower . head . snd) x) printName allNames
CVars o 30 CVars a 26 CVars n 26 CVars irred 21 CVars x 19 CVars rule 16 CVars d 14 CVars xs 11 CVars f 10 CVars from 10
:set -Wno-x-partial
reportTopX 10 (\x -> ((CExt==) . fst) x && (Char.isLower . head . snd) x) printName allNames
CExt view 13 CExt fmap 12 CExt help 10 CExt long 10 CExt takeRest 9 CExt pure 8 CExt putStrLn 8 CExt printName 7 CExt reportTopX 7 CExt show 7
reportTopX 10 (Char.isUpper . head . snd) printName allNames
CCon Semigroup 78 CCon Functor 71 CCon Category 61 CCon Monad 61 CCon Eq 51 CCon Ord 45 CCon Applicative 44 CCon Show 38 CCon FromInteger 32 CCon HasName 30