Our analysis of the source document proceeds section by section. We have isolated each section onto its own page and diagrammed it using this color code.
digraph {
  rankdir=LR
  node [shape=box style=filled]
  process [fillcolor=palegreen label="process\nopportunity"]
  technique [fillcolor=lightblue label="technique\ncomplexity"]
  responsibility [fillcolor=bisque label="responsibility\npace"]
  consensus [fillcolor=pink label="consensus\ngood"]
  norms [fillcolor=gold label="norms\nresults"]
  process -> technique -> responsibility -> norms
  technique -> consensus -> norms
  process -> consensus
}
Click nodes to proceed.
We're surprised by how many concepts are represented in this diagram. Here we count them by fetching the page, parsing out the edges, and word-counting the unique names we've used.
curl -s http://\
norms.ward.asia.wiki.org/\
xp-practice-network.json |\
jq -r '.story[1].text' |\
perl -ne 'print "$1\n$2\n" if /(\w+) -> (\w+)/' |\
sort | uniq | wc -w
We find 46 unique nodes and 63 connecting edges.
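The pipeline above counts unique names only. As a rough sketch of how the edge count could be obtained in the same spirit, the functions below read DOT text on standard input and count both. The function names (edge_pairs, count_nodes, count_edges) are hypothetical, and this version uses awk rather than the perl pattern above. It assumes simple word-like node names and no "->" inside quoted labels. It also expands chains such as a -> b -> c into two edges, whereas the perl pattern picks up only the first pair on a line, so its counts may differ from the figures reported here.

```shell
# Emit one "from to" pair per edge, expanding chains such as a -> b -> c.
# Assumes plain word names and no "->" inside quoted labels.
edge_pairs() {
  awk '{
    gsub(/;/, " ")        # statement separators are not part of names
    gsub(/->/, " -> ")    # make sure arrows are standalone tokens
    n = split($0, t, /[ \t]+/)
    for (i = 2; i < n; i++)
      if (t[i] == "->" && t[i-1] ~ /^[A-Za-z0-9_]+$/ && t[i+1] ~ /^[A-Za-z0-9_]+$/)
        print t[i-1], t[i+1]
  }'
}

# Count the distinct names that appear at either end of an edge.
count_nodes() {
  edge_pairs | tr ' ' '\n' | sort -u | wc -l | tr -d ' '
}

# Count the distinct directed edges.
count_edges() {
  edge_pairs | sort -u | wc -l | tr -d ' '
}

# Example on a small inline graph (illustrative data, not the wiki page's).
sample='digraph { process -> technique -> norms
process -> norms }'
printf '%s\n' "$sample" | count_nodes
printf '%s\n' "$sample" | count_edges
```

To try it on the fetched page, the output of the jq step above could be piped into count_nodes or count_edges in place of the perl, sort, uniq and wc steps.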
For comparison, we compute lower-case word frequencies in the original document.
curl -s http://\
xpdx.org/\
xp-and-normative-good.json |\
jq -r '.story[].text' |\
perl -ne 'print "\L$_\n" for /\w+/g' |\
sort | uniq -c | sort -n
We manually edit this list to include only domain-specific words that are used more than once.
2 cleanliness
2 codebase
2 consensus
2 deliver
2 distribution
2 judgement
2 measure
2 opportunity
2 optimal
2 responsibility
2 statements
3 customers
3 development
3 functionality
3 good
3 team
4 future
4 process
5 extreme
5 programmers
6 code
6 experience
7 developers
7 programming
8 customer
8 work
9 velocity
11 decisions
We count 28 reused domain words in 14 paragraphs.
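The "used more than once" half of the manual edit could be done mechanically before the hand pass. The sketch below is one way to do that, assuming plain text on standard input. The function name reused_words and its optional threshold argument are hypothetical, and it uses tr and awk in place of the perl one-liner above. Choosing which of the surviving words are domain-specific would still be a manual step.

```shell
# Lower-case word frequencies from stdin, keeping only words seen at least
# "min" times (default 2). Prints "count word", sorted by count, then word.
reused_words() {
  min=${1:-2}
  tr -cs '[:alnum:]_' '\n' |          # one word per line
    tr '[:upper:]' '[:lower:]' |      # fold to lower case
    awk 'NF' |                        # drop empty lines
    sort | uniq -c |
    awk -v min="$min" '$1 >= min { print $1, $2 }' |
    sort -k1,1n -k2,2
}

# Example on a short inline text (illustrative data, not the source document).
printf '%s\n' 'Code is good. Good code is Work.' | reused_words
```

Passing a number, as in reused_words 3, raises the threshold. The default of 2 mirrors the "more than once" rule described above.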