Details

    • Type: Bug
    • Status: Accepted
    • Priority: major
    • Resolution: Unresolved
    • Affects versions: None
    • Fix versions: None
    • Components: to-assess
    • Labels:

      Description

      Different languages have different conventions for titlecasing. Dutch
      capitalizes both letters of the digraph in "IJzer", and English titles such
      as "Buffy the Vampire Slayer" leave the "t" of "the" in lowercase. Consider
      adding structure that allows implementations to do better titlecasing. (In
      general, there will still be cases that can't be handled algorithmically,
      but this would allow much better treatment than is currently available.)

      One possibility: use the same structure as segmentation, whereby the text is
      segmented immediately before each character that should be capitalized.
      Thus, for example, it would return "|I|Jzer" or "|Buffy the |Vampire
      |Slayer". That would allow us to reuse the mechanisms that already exist.
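
To make the proposal concrete, here is a minimal sketch of how an implementation might consume such boundaries. It assumes the boundary positions are already available, here parsed from the "|" notation used above. The function names `parse_marked` and `titlecase_at` are hypothetical and not part of any existing API. The rules that would produce the boundaries for a given language are not shown.

```python
def parse_marked(marked: str, marker: str = "|") -> tuple[str, list[int]]:
    """Split a marked string such as "|I|Jzer" into plain text and boundary offsets."""
    chars: list[str] = []
    boundaries: list[int] = []
    for ch in marked:
        if ch == marker:
            # A marker records the offset of the next real character.
            boundaries.append(len(chars))
        else:
            chars.append(ch)
    return "".join(chars), boundaries


def titlecase_at(text: str, boundaries: list[int]) -> str:
    """Capitalize the character at each boundary offset and lowercase the rest.

    Assumption: str.upper() stands in for a true titlecase mapping, which
    differs for a few characters.
    """
    caps = set(boundaries)
    return "".join(
        ch.upper() if i in caps else ch.lower()
        for i, ch in enumerate(text)
    )


# Usage: recover the boundaries from the ticket's examples, then apply
# them to all-lowercase input.
_, dutch = parse_marked("|I|Jzer")
_, english = parse_marked("|Buffy the |Vampire |Slayer")
print(titlecase_at("ijzer", dutch))
print(titlecase_at("buffy the vampire slayer", english))
```

In this sketch the language-specific knowledge lives entirely in the boundary data, so the casing step itself stays generic.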

              People

              • Assignee: Mark Davis (mark.edward.davis)
              • Reporter: TracBot (apibot)