PEP: | 278 |
---|---|
Title: | Universal Newline Support |
Author: | jack at cwi.nl (Jack Jansen) |
Status: | Final |
Type: | Standards Track |
Created: | 14-Jan-2002 |
Python-Version: | 2.3 |
Post-History: |
This PEP discusses a way in which Python can support I/O on files which have a newline format that is not the native format on the platform, so that Python on each platform can read and import files with CR (Macintosh), LF (Unix) or CR LF (Windows) line endings.
It is more and more common to come across files that have an end of line that does not match the standard on the current platform: files downloaded over the net, remotely mounted filesystems on a different platform, Mac OS X with its double standard of Mac and Unix line endings, etc.
Many tools such as editors and compilers already handle this gracefully; it would be good if Python did so too.
Universal newline support is enabled by default, but can be disabled when Python is configured.
In a Python with universal newline support the feature is automatically enabled for all import statements and execfile() calls. There is no special support for eval() or exec.
In a Python with universal newline support, the mode parameter of open() can also be 'U', meaning 'open for input as a text file with universal newline interpretation'. Mode 'rU' is also allowed, for symmetry with 'rb'. Mode 'U' cannot be combined with other mode flags such as '+'. Any line ending in the input file will be seen as a '\n' in Python, so little other code has to change to handle universal newlines.
Conversion of newlines happens in all calls that read data: read(), readline(), readlines(), etc.
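The conversion described above can be illustrated with a minimal sketch. It assumes Python 3, where text-mode open() with the default newline=None performs the translation that this PEP attaches to mode 'U' (recent Python 3 releases no longer accept the 'U' flag itself). The helper names read_universal and demo, and the temporary file, are illustrative only.

```python
import os
import tempfile


def read_universal(path):
    """Return (text, lines) read with universal newline translation.

    Assumption: Python 3, where newline=None (the default) plays the
    role of the PEP's mode 'U'.
    """
    with open(path, "r", newline=None) as f:
        text = f.read()
    with open(path, "r", newline=None) as f:
        lines = f.readlines()
    return text, lines


def demo():
    # One line in each convention: CR (Mac), LF (Unix), CR LF (Windows).
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"mac\runix\nwindows\r\n")
        return read_universal(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    text, lines = demo()
    print(repr(text))
    print(lines)
```

Both read() and readlines() return every line terminated by a single line feed, whatever the terminator was on disk.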
There is no special support for output to file with a different newline convention, and so mode 'wU' is also illegal.
A file object that has been opened in universal newline mode gets a new attribute 'newlines' which reflects the newline convention used in the file. The value for this attribute is one of None (no newline read yet), '\r', '\n', '\r\n' or a tuple containing all the newline types seen.
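A short sketch of how the newlines attribute behaves, again assuming Python 3, whose text file objects expose an attribute of the same name with the same set of possible values. The helper observed_newlines is a hypothetical name used only for this illustration.

```python
import os
import tempfile


def observed_newlines(data):
    """Write `data` (bytes) to a temporary file and return the value of
    f.newlines before and after reading it with newline translation."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        with open(path, "r", newline=None) as f:
            before = f.newlines   # nothing has been read yet
            f.read()
            after = f.newlines    # a string, or a tuple for mixed files
        return before, after
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(observed_newlines(b"a\r\nb\r\n"))
    print(observed_newlines(b"a\rb\nc\r\n"))
```

A file with a single convention reports that convention as a string; a file that mixes conventions reports a tuple.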
Universal newline support is implemented in C, not in Python. This is done because we want files with a foreign newline convention to be import-able, so a Python Lib directory can be shared over a remote file system connection, or between MacPython and Unix-Python on Mac OS X. For this to be feasible the universal newline convention needs to have a reasonably small impact on performance, which means a Python implementation is not an option as it would bog down all imports. And because of files with multiple newline conventions, which Visual C++ and other Windows tools will happily produce, doing a quick check for the newlines used in a file (handing off the import to C code if a platform-local newline is seen) will not work. Finally, a C implementation also allows tracebacks and such (which open the Python source module) to be handled easily.
There is no output implementation of universal newlines; Python programs are expected to handle this by themselves or write files with platform-local convention otherwise. The reason for this is that input is the difficult case; outputting different newlines to a file is already easy enough in Python.
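To illustrate why output is considered the easy case, here is a minimal sketch of writing a file with an explicitly chosen line terminator. The function names are hypothetical, and the file is opened in binary mode on the assumption that no platform translation should be applied on top of the chosen terminator.

```python
import os
import tempfile


def join_lines(lines, newline):
    """Join lines (given without terminators) using `newline`."""
    return "".join(line + newline for line in lines)


def write_with_newline(path, lines, newline):
    # Binary mode: the bytes written are exactly the bytes we built.
    with open(path, "wb") as f:
        f.write(join_lines(lines, newline).encode("ascii"))


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_with_newline(path, ["first", "second"], "\r\n")
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)
```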
Also, an output implementation would be much more difficult than an input implementation, surprisingly: a lot of output is done through PyXXX_Print() methods, and at this point the file object is not available anymore, only a FILE *. So, an output implementation would need to somehow go from the FILE * to the file object, because that is where the current newline delimiter is stored.
The input implementation has no such problem: there are no cases in the Python source tree where files are partially read from C, partially from Python, and such cases are expected to be rare in extension modules. If such cases exist the only problem is that the newlines attribute of the file object is not updated during the fread() or fgets() calls that are done direct from C.
A partial output implementation, where strings passed to fp.write() would be converted to use fp.newlines as their line terminator but all other output would not, is far too surprising, in my view.
Because there is no output support for universal newlines there is also no support for a mode 'rU+': the surprise factor of the previous paragraph would hold to an even stronger degree.
There is no support for universal newlines in strings passed to eval() or exec. It is envisioned that such strings always have the standard \n line feed; if the strings come from a file, that file can be read with universal newlines.
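For source strings that do not come from a file, a program can normalise line endings itself before handing them to exec. The following is a minimal sketch under that assumption; normalize_newlines is a hypothetical helper, and the example uses the Python 3 exec() function rather than the exec statement this PEP refers to.

```python
def normalize_newlines(source):
    """Convert CR LF and lone CR line endings to a single line feed."""
    # Replace CR LF first so that it does not become two line feeds.
    return source.replace("\r\n", "\n").replace("\r", "\n")


# Source text with Windows and old Mac line endings mixed together.
namespace = {}
exec(normalize_newlines("x = 1\r\ny = x + 1\r"), namespace)
```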
I think there are no special issues with unicode. utf-16 shouldn't pose any new problems, as such files need to be opened in binary mode anyway. Interaction with utf-8 is fine too: values 0x0a and 0x0d cannot occur as part of a multibyte sequence.
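The UTF-8 claim can be spot-checked with a short sketch (Python 3 assumed; the function name and the sampling step are arbitrary choices for this illustration). Every byte of a multibyte UTF-8 sequence has its high bit set, so neither 0x0a nor 0x0d can appear in one. The last line shows, by contrast, that a UTF-16 code unit can contain the byte 0x0a, which is why such files have to be handled in binary mode.

```python
def multibyte_utf8_has_newline_bytes(step=251):
    """Sample non-ASCII code points and report whether any UTF-8
    encoding contains the byte 0x0a or 0x0d."""
    for cp in range(0x80, 0x110000, step):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogates cannot be encoded as UTF-8
        encoded = chr(cp).encode("utf-8")
        if b"\n" in encoded or b"\r" in encoded:
            return True
    return False


# U+010A in little-endian UTF-16 is the byte pair 0x0a 0x01.
utf16_contains_lf = b"\n" in "\u010a".encode("utf-16-le")
```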
Universal newline files should work fine with iterators and xreadlines() as these eventually call the normal file readline/readlines methods.
While universal newlines are automatically enabled for import they are not for opening, where you have to specifically say open(..., 'U'). This is open to debate, but here are a few reasons for this design:
- Compatibility. Programs which already do their own interpretation of \r\n in text files would break. Examples of such programs would be editors which warn you when you open a file with a different newline convention. If universal newlines were made the default such an editor would silently convert your line endings to the local convention on save. Programs which open binary files as text files on Unix would also break (but it could be argued they deserve it :-).
- Interface clarity. Universal newlines are only supported for input files, not for input/output files, as the semantics would become muddy. Would you write Mac newlines if all reads so far had encountered Mac newlines? But what if you then later read a Unix newline?
The newlines attribute is included so that programs that really care about the newline convention, such as text editors, can examine what was in a file. They can then save (a copy of) the file with the same newline convention (or, in case of a file with mixed newlines, ask the user what to do, or output in platform convention).
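The editor scenario above might look roughly like the following sketch. It assumes Python 3, where open() accepts a newline argument for writing as well as reading. The function names are hypothetical, and falling back to the platform convention for files with mixed or no newlines is just one of the policies the paragraph mentions.

```python
import os
import tempfile


def read_text_and_convention(path):
    """Return (text, newline) where newline is the convention to save with."""
    with open(path, "r", newline=None) as f:
        text = f.read()
        seen = f.newlines
    if isinstance(seen, str):
        return text, seen
    # None (no newline seen) or a tuple (mixed): use the platform convention.
    return text, os.linesep


def save_with_convention(path, text, newline):
    # With newline set to '\r' or '\r\n', each '\n' written is translated
    # to that string; with '\n' the text is written unchanged.
    with open(path, "w", newline=newline) as f:
        f.write(text)


def demo():
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"one\rtwo\r")          # old Mac line endings
        text, newline = read_text_and_convention(path)
        save_with_convention(path, text + "three\n", newline)
        with open(path, "rb") as f:
            return text, newline, f.read()
    finally:
        os.remove(path)
```

The program works with '\n' throughout and only the save step needs to know which convention the file used.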
Feedback is explicitly solicited on one item in the reference implementation: whether or not the universal newlines routines should grab the global interpreter lock. Currently they do not, but this could be considered living dangerously, as they may modify fields in a FileObject. But as these routines are replacements for fgets() and fread() as well it may be difficult to decide whether or not the lock is held when the routine is called. Moreover, the only danger is that if two threads read the same FileObject at the same time an extraneous newline may be seen or the newlines attribute may inadvertently be set to mixed. I would argue that if you read the same FileObject in two threads simultaneously you are asking for trouble anyway.
Note that no globally accessible pointers are manipulated in the fgets() or fread() replacement routines, just some integer-valued flags, so the chances of core dumps are zero (he said:-).
Universal newline support can be disabled during configure because it does have a small performance penalty, and moreover the implementation has not been tested on all conceivable platforms yet. It might also be silly on some platforms (WinCE or Palm devices, for instance). If universal newline support is not enabled then file objects do not have the newlines attribute, so testing whether the current Python has it can be done with a simple:
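The check introduced by the sentence above can be sketched as follows. This is a reconstruction inferred from the surrounding notes, which say the test is applied to open() and relies on open being the file type. It is written as a function so that it also runs on Python 3, and the name has_universal_newlines is hypothetical.

```python
def has_universal_newlines(opener=open):
    """Return True if `opener` exposes a 'newlines' attribute.

    Assumption: on the Python versions this PEP targets, open was the
    file type, so the attribute was visible on it whenever universal
    newline support was compiled in.
    """
    return hasattr(opener, "newlines")


if __name__ == "__main__":
    if has_universal_newlines():
        print("We have universal newline support")
    else:
        print("No 'newlines' attribute found on open")
```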
Note that this test uses the open() function rather than the file type so that it won't fail for versions of Python where the file type was not available (the file type was added to the built-in namespace in the same release as the universal newline feature was added).
Additionally, note that this test fails again on Python versions >= 2.5, when open() was made a function again and is not synonymous with the file type anymore.
A reference implementation is available in SourceForge patch #476814: https://bugs.python.org/issue476814
This document has been placed in the public domain.
Source: https://github.com/python/peps/blob/master/pep-0278.txt