How to type U+2028 in MS Word
Thread poster: Samuel Murray
Samuel Murray
Netherlands
Member (2006)
English to Afrikaans
Feb 4, 2013

G'day everyone

I'm editing an XML file in MS Word as plain UTF-8 text, and the text contains "line separator" characters (U+2028). MS Word displays these characters as large spaces, but I can't copy/paste them directly. Is there a way that I can use advanced search/replace to find and/or replace these characters?

Thanks
Samuel


 
esperantisto
Member (2006)
English to Russian
SITE LOCALIZER
Use Apache OpenOffice Feb 5, 2013

In Apache OpenOffice you can do it: copy the symbol to the clipboard, paste it into the search field and do a find and replace.

 
Samuel Murray
Netherlands
Member (2006)
English to Afrikaans
THREAD STARTER
Thanks Feb 5, 2013

esperantisto wrote:
In Apache OpenOffice you can do it: copy the symbol to the clipboard, paste it into the search field and do a find and replace.


Thanks -- I used LibreOffice to replace all such characters with something else, temporarily.
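
For anyone who would rather script that temporary swap than do it in an office suite, here is a minimal Python sketch. The placeholder string `[[LSEP]]` and both function names are my own assumptions, chosen only for illustration; any token that does not already occur in the file would do.

```python
LINE_SEP = "\u2028"        # Unicode LINE SEPARATOR
PLACEHOLDER = "[[LSEP]]"   # hypothetical visible token, not from any tool


def hide_separators(text, placeholder=PLACEHOLDER):
    """Swap every U+2028 for a visible placeholder before editing."""
    if placeholder in text:
        # Refuse to continue, otherwise restoring would be ambiguous.
        raise ValueError("placeholder already occurs in the text")
    return text.replace(LINE_SEP, placeholder)


def restore_separators(text, placeholder=PLACEHOLDER):
    """Put U+2028 back wherever the placeholder was left in place."""
    return text.replace(placeholder, LINE_SEP)
```

Read the file with `encoding="utf-8"`, run `hide_separators`, edit the result in Word, delete the placeholders that are not needed, and run `restore_separators` on what remains.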


 
Rolf Keller
Germany
English to German
Use the regular expressions feature? Feb 5, 2013

Samuel Murray wrote:

Is there a way that I can use advanced search/replace to find and/or replace these characters?


I assume that all these separators are between XML elements, i.e. between angle brackets. If so, you could try to search for them using Word's wildcard feature.

E.g. search for "\)?\(" and replace it with ")something(".

Note: Because ProZ's software doesn't like angle brackets, I've used round brackets. Please change accordingly.
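
Outside Word, the same idea can be sketched in Python with real angle brackets. This is only an illustration under assumptions: the function name is hypothetical, and unlike the Word pattern (where "?" matches any single character) it matches U+2028 specifically, and only when it sits directly between a closing and an opening angle bracket.

```python
import re

# U+2028 directly between ">" and "<", i.e. between two XML tags.
BETWEEN_TAGS = re.compile(">\u2028<")


def replace_between_tags(text, replacement="something"):
    """Replace U+2028 only where it sits between two tags."""
    # A function is used as the replacement so that backslashes in
    # `replacement` are inserted literally, not treated as escapes.
    return BETWEEN_TAGS.sub(lambda match: ">" + replacement + "<", text)
```

Separators inside element text are left untouched by this pattern.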

[Edited at 2013-02-05 12:42 GMT]


 
Samuel Murray
Netherlands
Member (2006)
English to Afrikaans
THREAD STARTER
The character occurs alone Feb 5, 2013

Rolf Keller wrote:
I assume that all these separators are between XML elements, i.e. between angle brackets.


Alas, no. They occur in various places in the file, even outside translatable text. I suspect the original file (an IDML file) that the XML file (a TXML file) was created from had line breaks which were retained in the XML file as character U+2028.

I can copy this character in MS Word but I can't use it in the find/replace dialog. I can copy/paste it in one or two text editors (not all, even). I tried MS Word's find dialog's Unicode character longhand, as "U^2028", but MS Word doesn't recognise the character.

For the most part I need not include these characters in the translation, but there are a number of cases where they clearly represent a non-optional line break, and there they must be retained.
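
Deciding case by case which separators to keep is easier with a list of where they occur. A rough Python sketch of such a listing follows; the function name and the default of 20 context characters are assumptions for illustration, not part of any tool mentioned in this thread.

```python
LINE_SEP = "\u2028"  # Unicode LINE SEPARATOR


def find_separators(text, context=20):
    """Return (position, text_before, text_after) for every U+2028."""
    hits = []
    pos = text.find(LINE_SEP)
    while pos != -1:
        before = text[max(0, pos - context):pos]
        after = text[pos + 1:pos + 1 + context]
        hits.append((pos, before, after))
        pos = text.find(LINE_SEP, pos + 1)
    return hits
```

Each entry shows the text on both sides of one separator, which should be enough to judge whether it is a real line break or just noise.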


 
Alexander C. Thomson
Netherlands
Dutch to English
Another possible way Feb 5, 2013

Looks like you've got it sorted already, but another possible tip for such situations is to turn on Show Paragraph Marks from Home>Formatting (the 'backwards P' button), highlight one of the characters you want removed or replaced (possibly by moving the cursor to the start of the line on which it occurs and using Shift+Down Arrow to select that line and hence that character), and then use Ctrl+H to replace them.

Alex


 

