Font and text overhaul #30161
Draft
QuLogic wants to merge 79 commits into matplotlib:main from QuLogic:text-overhaul-figures
Fix center of rotation with rotation_mode='anchor'
Glyph indices are specific to each font. It does not make sense to fall back based on glyph index to another font. This could only really be populated by calling `FT2Font.set_text`, but even that was fragile. If a fallback font was used for a character with the same glyph index as a previous character in the main font, then these lookups could be overwritten to the fallback instead of the main font, with a completely different character! Fortunately, nothing actually uses or requires a fallback through glyph indices.
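The overwrite described above can be illustrated with a toy model. This is only a sketch: the character maps, glyph indices, and the `glyph_for` helper are made up for illustration and are not matplotlib's `FT2Font` internals.

```python
# Toy character maps (character -> glyph index); the values are made up.
MAIN_CMAP = {"A": 36, "B": 37}
FALLBACK_CMAP = {"あ": 36, "い": 37}  # same indices, unrelated glyphs


def glyph_for(char):
    """Return (font_name, glyph_index) for char, trying the main font first."""
    for name, cmap in (("main", MAIN_CMAP), ("fallback", FALLBACK_CMAP)):
        if char in cmap:
            return name, cmap[char]
    raise KeyError(char)


# Keying a lookup by glyph index alone lets the fallback entry
# overwrite the entry that was recorded for the main font.
by_index = {}
for ch in "Aあ":
    font, index = glyph_for(ch)
    by_index[index] = font

# Keying by character keeps the two fonts apart.
by_char = {ch: glyph_for(ch) for ch in "Aあ"}
```

After the loop, `by_index[36]` points at the fallback font even though "A" came from the main font, which is the fragility the commit message describes.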
Remove ttconv backwards-compatibility code
Remove fallback code for glyph indices
045897c to e4be26c
e4be26c to 2b3f5c5
Also, if you would like to follow along with the figure changes, I've posted a branch that does the changes per merge commit: https://github.com/QuLogic/matplotlib/tree/text-overhaul-figures-per-commit
This allows checking that there are no _new_ failures, without committing the new figures to the repo until the branch is complete.
ci: Preload existing test images from text-overhaul-figures branch
Also, check some expected conditions at parse time instead of somewhere during use of the data.
ci: Fix image preload with multiple conflicts
Add typing to AFM parser
2b3f5c5 to b17bef1
ft2font: Split layouting from set_text
Add os.PathLike support to FT2Font constructor, and FontManager
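One common way to accept `os.PathLike` objects is to normalize the argument with `os.fspath`. The sketch below assumes that approach; `normalize_font_path` is a hypothetical helper, not a function in matplotlib, and the PR may implement this differently.

```python
import os
from pathlib import Path


def normalize_font_path(path):
    """Hypothetical helper: accept str, bytes, or os.PathLike.

    os.fspath returns str or bytes unchanged, calls __fspath__ on
    path-like objects, and raises TypeError for anything else.
    """
    return os.fspath(path)


font_file = normalize_font_path(Path("fonts") / "DejaVuSans.ttf")
```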
16619c6 to 2117a71
If the larger glyphs for an auto-sized character in `cmex10` use a character that is in the `latex_to_bakoma` table, then that character will be mapped an extra time into `cmr10` (usually). Thus we end up with a large version of a "normal" character, such as an exclamation point.
Instead, map these glyphs through the `latex_to_bakoma` table by using their glyph names as "commands". This ensures they don't get double-mapped to the wrong font and fixes the following issues:
- slash (/) uses a comma at the larger sizes
- right parenthesis uses an exclamation point at the largest size
- left and right braces use parentheses at the largest size
- right floor uses a percentage sign at the largest size
- left ceiling uses an ampersand at the largest size
Also, drop the regular size braces, as they are the same as the first `big`-sized version.
Fix auto-sized glyphs with BaKoMa fonts
For character codes outside the embedded font limits (256 for type 3 and 65536 for type 42), we output them as XObjects instead of using text commands. But there is nothing in the PDF spec that requires any specific encoding like this. Since we now support subsetting all fonts before embedding, split each font into groups based on the maximum character code (e.g., 256-entry groups for type 3), then switch text strings to a different font subset and re-map character codes to it when necessary. This means all text is true text (albeit with some strange encoding), and we no longer need any XObjects for glyphs. For users of non-English text, this means it will become selectable and copyable again. Fixes matplotlib#21797
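The grouping described above can be sketched as a plain quotient and remainder. This is a minimal illustration that assumes the subset is chosen by integer division; `split_code` is a hypothetical name, not the function used in the PDF backend.

```python
def split_code(charcode, subset_size):
    """Return (subset_index, code_within_subset) for a character code.

    subset_size is assumed to be 256 for Type 3 and 65536 for Type 42,
    matching the limits quoted in the commit message.
    """
    return divmod(charcode, subset_size)


latin = split_code(ord("é"), 256)       # fits in the first Type 3 subset
cjk = split_code(0x4E2D, 256)           # needs a later Type 3 subset
emoji = split_code(0x1F600, 65536)      # outside the BMP, second Type 42 subset
```

With this scheme a single CJK character already lands in subset 78 of a Type 3 font, which is why a later commit in this PR compresses the blocks.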
For Type 3 fonts, add a `ToUnicode` mapping (which was added in PDF 1.2), and for Type 42 fonts, correct the Unicode encoding, which should be UTF-16BE, not UCS2.
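The difference between UCS-2 and UTF-16BE only shows up outside the Basic Multilingual Plane, where UTF-16 uses a surrogate pair. The sketch below uses Python's standard `utf-16-be` codec to show the hex strings involved; `to_unicode_hex` is a hypothetical helper name.

```python
def to_unicode_hex(text):
    """Encode text as UTF-16BE and return it as an uppercase hex string."""
    return text.encode("utf-16-be").hex().upper()


bmp = to_unicode_hex("A")               # one 16-bit unit, same as UCS-2
astral = to_unicode_hex("\U0001F600")   # surrogate pair, not expressible in UCS-2
```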
These characters are outside the BMP and should test subset splitting for type 42 output in PDF.
2117a71 to d56b646
pdf: Improve text with characters outside embedded font limits
d56b646 to 9703849
No need to repeat the calculation of subset blocks, but instead offload it to `track_glyph`.
Instead of splitting fonts into `subset_size` blocks and writing text as character code modulo `subset_size`, compress the blocks by doing two things:
1. Preserve the character code if it lies in the first block. This keeps ASCII (for Type 3) and the Basic Multilingual Plane (for Type 42) as their normal codes.
2. Push everything else into the next spot in the next block, splitting by `subset_size` as necessary.
This should reduce the number of additional font subsets to embed.
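The two rules above can be sketched with a small allocator. `SubsetAllocator` is a hypothetical stand-in written for illustration; it is not matplotlib's `CharacterTracker`, and a real implementation may reserve some codes (for example code 0) that this sketch hands out freely.

```python
class SubsetAllocator:
    """Hypothetical sketch of compressed subset allocation."""

    def __init__(self, subset_size):
        self.subset_size = subset_size
        self.assigned = {}   # charcode -> (subset, code)
        self.next_slot = 0   # next free slot, counted from the start of subset 1

    def track(self, charcode):
        """Return (subset, code) for charcode, allocating a slot if needed."""
        if charcode in self.assigned:
            return self.assigned[charcode]
        if charcode < self.subset_size:
            # Rule 1: codes in the first block keep their value.
            result = (0, charcode)
        else:
            # Rule 2: everything else takes the next free slot,
            # rolling over into a new subset every subset_size slots.
            subset, code = divmod(self.next_slot, self.subset_size)
            result = (subset + 1, code)
            self.next_slot += 1
        self.assigned[charcode] = result
        return result


alloc = SubsetAllocator(256)          # Type 3 limit from the commit message
ascii_a = alloc.track(ord("A"))
cjk = alloc.track(0x4E2D)
emoji = alloc.track(0x1F600)
```

Here the CJK character and the emoji share subset 1, whereas splitting by character code modulo `subset_size` would have placed them in two separate, mostly empty subsets.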
If mixing languages, sometimes a single character may use different glyphs in one document. In that case, we need to give it a new character code in the next subset, since subset 0 is preserving character codes.
For ligatures or complex shapings, multiple characters may map to a single glyph. In this case, we still want to output a single character code for the string using the font subset, but the `ToUnicode` map should give back all the characters.
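In a PDF `ToUnicode` CMap, a `bfchar` entry can map one character code to a string of several UTF-16BE code units, which is what the ligature case needs. The sketch below formats such an entry; `bfchar_entry` and its `code_bytes` parameter are hypothetical, and the code width actually used for each font type is an assumption here.

```python
def bfchar_entry(code, text, code_bytes=1):
    """Format one bfchar line mapping a character code to its source text.

    code_bytes is the assumed width of a character code in the font
    subset (1 byte is assumed for Type 3, 2 bytes for Type 42).
    """
    src = format(code, "0{}X".format(2 * code_bytes))
    dst = text.encode("utf-16-be").hex().upper()
    return "<{}> <{}>".format(src, dst)


ligature = bfchar_entry(1, "fi")            # one glyph, two characters
single = bfchar_entry(0x41, "A", code_bytes=2)
```

A viewer that copies the text drawn with code 1 would then recover both "f" and "i", even though only one character code was written.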
Previously, this was supposed to "upgrade" type 3 to type 42 if the number of glyphs overflowed. However, as `CharacterTracker` can suggest a new subset for other reasons (i.e., multiple glyphs for the same character or a glyph for multiple characters may go to a second subset), we do need proper subset handling here as well. Since that is now done, we can drop the "promotion" from type 3 to type 42, as we don't get too many glyphs in each embedded font.
Prepare `CharacterTracker` for advanced font features
Font features allow font designers to provide alternate glyphs or shaping within a single font. These features may be accessed via special tags corresponding to internal tables of glyphs. The mplcairo backend supports font features via an elaborate re-use of the font file path [1]. This commit adds the API to make this officially supported in the main user API. [1] https://github.com/matplotlib/mplcairo/blob/v0.6.1/README.rst#font-formats-and-features
Add font feature API to Text
Previously, in a mathtext string like `r"$\sin x$"`, a thin space would
(correctly) be added between "sin" and "x", but that space would be
missing in expressions like `r"$\max f$"`. The difference arose because
of the slightly different handling of subscripts and superscripts
after the `\sin` and `\max` operators: `\sin^n` puts the superscript as
a normal exponent, but `\max_x` puts the subscript centered below the
operator name ("overunder" symbol). The previous code for inserting the
thin space did not handle the "overunder" case; fix that. The new
behavior is tested by the change in test_operator_space, as well as by
mathtext1_dejavusans_06.
The change in mathtext_foo_29 arises because the extra thin space now
inserted after `\limsup` slightly shifts the centering of the whole
string. Ideally that thin space should be suppressed if there's no
token after the operator, but that's not something currently implemented
either for e.g. `\sin` (compare e.g. the right-alignments in
`text(.5, .9, r"$\sin$", ha="right"); text(.5, .8, r"$\mathrm{sin}$", ha="right"); axvline(.5)`
where the extra thin space after `\sin` is visible), so this patch just
makes things more consistent.
Rename _in_subscript_or_superscript to the more descriptive _needs_space_after_subsuper; simplify its setting in operatorname(); avoid the need to introduce an extra explicitly-typed spaced_nucleus variable.
51d80cc to ca9c54c
PR summary
This PR is intended to hold all font and text PRs from the project "Font and text overhaul".
In order to not overwhelm the main repo with the churn of test image replacements, this PR comes from my fork and should only ever have 1 commit more than the text-overhaul branch with the changes to test images.
PR checklist