Commit 8ad75bc

Downrank canon tag matches
1 parent 1e7dcd7 commit 8ad75bc

7 files changed

Lines changed: 20 additions & 11 deletions

‎_content/articles/lao-buddhist-women_tsomo-karma-lekshe.md‎

Lines changed: 0 additions & 1 deletion
@@ -11,7 +11,6 @@ course: theravada
 tags:
   - laotian
   - nuns
-  - bhikkhuni
   - gender
 year: 2010
 journal: bsr

‎_content/monographs/sapiens_harari-y.md‎

Lines changed: 1 addition & 0 deletions
@@ -7,6 +7,7 @@ olid: OL28326205M
 course: world
 tags:
   - evolution
+  - past
   - power
 year: 2011
 publisher: "Kinneret Zmora-Bitan Dvir"

‎_data/content.yml‎

Lines changed: 1 addition & 0 deletions
@@ -25,3 +25,4 @@ ccmm: 1.12 # category match multiplier
 scmm: 1.08 # similar category multiplier
 fcmm: 1.2 # featured content multiplier
 tsmm: 0.5 # two star content multiplier
+ctmm: 0.5 # downweight canon book tag matches

‎_plugins/similar_content.rb‎

Lines changed: 15 additions & 5 deletions
@@ -8,13 +8,15 @@ class SimilarContentFooterTag < Liquid::Tag
   @@similar_cats = nil
   @@content = nil
   @@content_for_tag = nil
+  @@canon_tags = nil
 
   def dataInit(v)
     puts "Prefetching data for similar_content footer..."
     @@config = v["site.data.content"]
     @@parents_for_tag = {}
     @@similar_cats = {}
     @@content = []
+    @@canon_tags = Set.new
     @@content_for_tag = Hash.new { |h, k| h[k] = Set.new }
     for cofefe in v["site.categories"]
       cat = cofefe.to_liquid.to_h
@@ -23,6 +25,9 @@ def dataInit(v)
     for cofefe in v["site.tags"]
       tag = cofefe.to_liquid.to_h
       @@parents_for_tag[tag["slug"]] = tag["parents"]
+      if tag["is_canon"]
+        @@canon_tags << tag["slug"]
+      end
     end
     for cofefe in v["site.content"]
       c = cofefe.to_liquid.to_h
@@ -133,16 +138,21 @@ def render(v)
         end
         if candidate["tags"]&.size&.nonzero? then
           denom += candidate["tags"].size * @@config["tdms"]
-          for t in include_content["tags"]
-            if candidate["tags"].include? t then
-              score += @@config["ttms"]
+          for t in candidate["tags"]
+            ttms = @@config["ttms"]
+            if @@canon_tags.include? t and candidate["category"] == "canon"
+              ttms *= @@config["ctmm"]
+              denom -= @@config["tdms"] * (1.0 - @@config["ctmm"])
+            end
+            if include_content["tags"].include? t then
+              score += ttms
             else
               if @@parents_for_tag[t] then
                 for p in @@parents_for_tag[t]
-                  if candidate["tags"].include? p then
+                  if include_content["tags"].include? p then
                     score += @@config["tpms"]
                     break
-                  elsif candidate["course"] == p then
+                  elsif include_content["course"] == p then
                     score += @@config["tpms"]
                     break
                   end
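
As a reading aid for the `render` hunk in `_plugins/similar_content.rb`, here is a minimal standalone sketch of the new tag-scoring loop. The function name `tag_score`, the plain-hash inputs, and the values of `ttms`, `tpms`, and `tdms` are assumptions for illustration; only `ctmm: 0.5` comes from this commit, and the real plugin accumulates `score` and `denom` across other signals as well.

```ruby
require "set"

# Hypothetical weights: only ctmm (0.5) is taken from this commit;
# ttms, tpms and tdms are placeholder values chosen for illustration.
CONFIG = { "ttms" => 1.0, "tpms" => 0.5, "tdms" => 1.0, "ctmm" => 0.5 }.freeze

# Returns [score, denom] contributed by the candidate's tags alone.
def tag_score(include_content, candidate, canon_tags, parents_for_tag, config = CONFIG)
  score = 0.0
  denom = 0.0
  tags = candidate["tags"] || []
  return [score, denom] if tags.empty?

  denom += tags.size * config["tdms"]
  tags.each do |t|
    ttms = config["ttms"]
    # Canon tags on canon-category candidates count for less,
    # in both the numerator and the denominator.
    if canon_tags.include?(t) && candidate["category"] == "canon"
      ttms *= config["ctmm"]
      denom -= config["tdms"] * (1.0 - config["ctmm"])
    end
    if include_content["tags"].include?(t)
      score += ttms
    else
      # No direct match: give partial credit if a parent of the tag
      # matches one of the page's tags or its course.
      (parents_for_tag[t] || []).each do |p|
        if include_content["tags"].include?(p) || include_content["course"] == p
          score += config["tpms"]
          break
        end
      end
    end
  end
  [score, denom]
end

# Made-up example data.
page = { "tags" => ["sutta", "monastic"], "course" => "theravada" }
cand = { "tags" => ["sutta", "nuns"], "category" => "canon" }
p tag_score(page, cand, Set["sutta"], { "nuns" => ["monastic"] })
# prints [1.0, 1.5]
```

With these placeholder weights, the canon tag `sutta` contributes 0.5 instead of 1.0 to the score and shrinks the denominator by the same 0.5, while `nuns` earns the parent-match credit through `monastic`.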

‎_tags/canonical-poetry.md‎

Lines changed: 0 additions & 1 deletion
@@ -2,7 +2,6 @@
 title: "Poetry of the Pāḷi Canon"
 status: unpublished
 parents: [sutta]
-is_canon: true # changes the layout slightly
 ---
 
 Mostly for the canonical poetry collections of the Kn, but also for the miscellaneous poetry found throughout The Canon.

‎_tags/rebirth-stories.md‎

Lines changed: 0 additions & 1 deletion
@@ -2,7 +2,6 @@
 title: "Canonical Rebirth Stories"
 status: unpublished
 parents: [indian, sutta]
-is_canon: true # changes the layout slightly
 ---
 
 Rebirth stories in the Pāḷi Canon.

‎_tests/similar_content.md‎

Lines changed: 3 additions & 3 deletions
@@ -62,7 +62,7 @@ partial_cases:
   - [mindfulness-in-plain-english_gunaratana, how-to-meditate_yuttadhammo]
 ---
 
-A series of test cases to measure the quality of the content recommendations algorithm.
+A series of hand-labeled examples to estimate the quality of the content recommendations algorithm.
 
 The passing rate at the bottom is a general indicator of the algo's recall rate. It's not meant to be 100%.
 
@@ -76,7 +76,7 @@ The passing rate at the bottom is a general indicator of the algo's recall rate.
 
 | Test Name | Status | Notes |
 |-----------|---------|--------|{% for test in cases %}{% assign fc = site.content | find: "slug", test[0] %}{% assign sc = site.content | find: "slug", test[1] %}{% capture cont_req %}{% assign include_content = fc %}{% similar_content %}{% endcapture %}
-| "[{{ fc.title | split: ':' | first }}]({{ fc.url }})" should recommend "[{{ sc.title | split: ':' | first }}]({{ sc.url }})" <details><code>{{ cont_req | strip_html | strip_newlines }}</code></details> | {% if cont_req contains test[1] %}{% assign succs = succs | plus: 1 %}Pass ✅{% else %}{% assign fails = fails | plus: 1 %}FAIL ❌{% endif %} | of {% assign c = cont_req | split: "</li>" | size | minus: 2 %}{% assign simcount = simcount | plus: c %}{{ c }} |{% endfor %}
+| "[{{ fc.title | split: ':' | first }}]({{ fc.url }})" should recommend "[{{ sc.title | split: ':' | first }}]({{ sc.url }})" <details><code>{{ cont_req | strip_html | strip_newlines }}</code></details> | {% if cont_req contains test[1] %}{% assign succs = succs | plus: 1 %}Included{% else %}{% assign fails = fails | plus: 1 %}Not included{% endif %} | of {% assign c = cont_req | split: "</li>" | size | minus: 2 %}{% assign simcount = simcount | plus: c %}{{ c }} |{% endfor %}
 |-----|------|----|
-| Totals: | {{ succs }} Pass and {{ fails }} Failed | {{ simcount }} total recommendations |
+| Totals: | {{ succs }} included out of {% assign total = fails | plus: succs %}{{total}} cases ({{ succs | times: 100 | divided_by: total }}%) | {{ simcount }} total recommendations |
 
