Overview
- Aim for perfect text and formatting (markdown instructions here).
- Please strive to understand every word and sentence you edit. प्रत्येकं वाक्यं व्याकरणशुद्ध्या ऽवगत्य शोधयितुं यतताम्।
- Quality over quantity. Don’t hurry - stay within the agreed daily work limit. More daily work can mean more errors.
- Follow multi-script proofreading.
Contribution levels
Our expectation: reading the corrected text should be at least as good an experience as reading the original PDF, if not better. Even if you fall short of that, leaving the text in a significantly better state than before is valuable.
- Top level: Perfect text and formatting.
- Next level: Perfect text, with basic formatting. The reader won’t feel any particular urge to consult the source most of the time.
- Next level: Almost perfect text (possibly missing diacritics and accents), with basic formatting (contiguous paragraphs, footnotes etc.).
- And so on.
We generally expect top-level contributions from paid proofreaders.
Often, OCR makes very few mistakes (<5) on a page if the print is good. It probably takes more human hours to add structure than to proofread. We should take structure seriously.
Multi-script proofreading
- The text should be examined in multiple scripts (as agreed earlier), e.g. devanAgarI and ISO, or devanAgarI and kannaDa. Script conversion may be done using aksharamukhA.
- Problem being solved: In some scripts, it is hard to distinguish some characters, and mAtra-s may be missed. For example, in devanAgarI - ब व, म स; in IAST/ISO - ki kī, ku kū; in kannaDa - ದ ಧ ಥ, ಡ ಢ. A proofreader proficient in sanskrit once read the faulty “प्लोनेन परिमार्जनानन्तरं तदुक्तं पाद्यं चाधिकमुक्तम् । अत्रापि स एव हेतरनुमन्धेयः ।” as the correct “प्लोतेन परिमार्जनानन्तरं तद् उक्तं, पाद्यं चाधिकम् उक्तम् । अत्रापि स एव हेतुर् अनुसन्धेयः ।”, without noticing the errors in the former.
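As a small aid for the problem described above, one could flag the easily confused characters in a passage so that they get a second look. This is only a minimal sketch in Python: the function name `confusable_positions` and the `CONFUSABLE_GROUPS` table are hypothetical, the groups are just the ones listed above, and vowel-length confusions such as ki/kī are not covered.

```python
# Hypothetical helper: the groups are copied from the examples in this guide;
# extend them as needed for other scripts.
CONFUSABLE_GROUPS = {
    "devanAgarI": ["बव", "मस"],
    "kannaDa": ["ದಧಥ", "ಡಢ"],
}


def confusable_positions(text, script):
    """Return (index, character, group) for every character in `text`
    that belongs to a confusable group of the given script."""
    hits = []
    for index, char in enumerate(text):
        for group in CONFUSABLE_GROUPS.get(script, []):
            if char in group:
                hits.append((index, char, group))
    return hits


if __name__ == "__main__":
    for index, char, group in confusable_positions("बालः वनं", "devanAgarI"):
        print(f"position {index}: {char} (compare with: {' '.join(group)})")
```

Such a list does not replace reading the text in a second script; it only points at places where a slip is more likely.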
Typing correct symbols
- Please use the correct symbols. Common mistakes: | (pipe) instead of । (daNDa), : (colon) instead of ः (visarga), ० (शून्यम्, the devanAgarI digit zero) instead of ॰ (devanAgarI abbreviation sign).
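The common mistakes listed above can be searched for mechanically. The following Python sketch is an illustration only: `find_symbol_mistakes` and `CHECKS` are hypothetical names, and the rules are heuristics (for instance, a colon right after a devanAgarI letter is assumed to be a mistyped visarga), so every hit still needs a human decision.

```python
import re

# Heuristic checks (assumptions, not project tooling). The range
# \u0900-\u0963 covers devanAgarI letters and vowel signs but excludes
# daNDa-s and digits, so a zero inside a number such as १० is not flagged.
CHECKS = [
    (re.compile(r"\|"),
     "pipe '|' found; use daNDa '।' (U+0964)"),
    (re.compile(r"(?<=[\u0900-\u0963]):"),
     "colon after a devanAgarI letter; use visarga 'ः' (U+0903)"),
    (re.compile(r"(?<=[\u0900-\u0963])\u0966"),
     "digit zero '०' after a letter; use abbreviation sign '॰' (U+0970)"),
]


def find_symbol_mistakes(text):
    """Return sorted (line_number, column, message) tuples, both 1-based,
    for each suspicious symbol found in `text`."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for pattern, message in CHECKS:
            for match in pattern.finditer(line):
                findings.append((line_no, match.start() + 1, message))
    return sorted(findings)


if __name__ == "__main__":
    sample = "राम: गच्छति |\nपृ० १०"
    for line_no, column, message in find_symbol_mistakes(sample):
        print(f"line {line_no}, column {column}: {message}")
```

Columns are counted in Unicode code points, so they may not match what an editor that counts visual clusters reports.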
Special characters
If you cannot type unusual Unicode characters, copy them from here and paste.
- IAST diacritics
  - ā Ā ī Ī ū Ū ṛ Ṛ ṝ Ṝ ḷ Ḷ ḹ Ḹ
  - ṃ Ṃ ḥ Ḥ
  - ṅ Ṅ ñ Ñ
  - ṭ Ṭ ḍ Ḍ
  - ś Ś ṣ Ṣ
- ISO
  - ē ō r̥ r̥̄ l̥ l̥̄ ṁ
- Vedic Svaras
  - ॒ ॑
- No harm using ISO instead of IAST - we can fix it later.
- No harm ignoring initial letter capitalization (i.e. ṣ instead of Ṣ and so on).
Telugu
Certain defects are common in old sanskrit texts published in telugu script. Please use sanskrit knowledge to detect and correct those. If in doubt, ask with a screenshot.
One needs to remove fake spaces. For example, ఆర్యమిశ్రాః instead of ఆర్యమి శ్రాః, and బహూనామాస్తికానామహాత్మనాం instead of బహూనా మా స్తికానామహాత్మనాం.
Also అథచాస్యాద ర్శా౯ - old telugu and kannaDa texts use something like ౯ (with extra curls) for n - so it should be అథచాస్యాదర్శాన్.
Sometimes, instead of వో, they use a symbol like వేృ. So, this should be recognized as వో.
Telugu sanskrit books often print dh (ध्) where th (थ्) is meant (and rarely vice versa) - for example, the printed గ్రంధోయం should be corrected to గ్రంథోయం.
The following are often confused by proofreaders (so beware)-
- na, sa
- n-maatraa, m-maatraa
- v-maatraa, p-maatraa
- ड, द, ध्-maatraas and letters
ఙ and ఞ, which are not used in common telugu, are used in these texts. So beware of mistaking those too.
In case of tamiL or maNipravALa texts in telugu script, printing ऴ् and ऱ् would be complicated.
Things to ignore
- Quotation mark placement which is not ‘bad’ as described in examples above - ie. don’t spend time trying to make it ‘best’.
- Empty spaces in lines. Don’t spend time correcting spaces like this.