Git replacing LF with CRLF


On a Windows machine, I added some files using git add. I got warnings saying:

LF will be replaced by CRLF

What are the ramifications of this conversion?



Popular Answers:

  1. If you already have checked out the code, the files are already indexed. After changing your Git settings, say by running:

    git config --global core.autocrlf input 

    You should refresh the indexes with

    git rm --cached -r . 

    And rewrite the Git index with

    git reset --hard 

    Note: this will remove your local changes. Consider stashing them before you do this.


    Source: Configuring Git to Handle Line Endings

  2. Both unix2dos and dos2unix are available on Windows with Git Bash. You can use the following command to perform a UNIX (LF) → DOS (CRLF) conversion, after which you will not get the warning.

    unix2dos filename 

    or

    dos2unix -D filename 

    But, don’t run this command on any existing CRLF file, because then you will get empty newlines every second line.

    dos2unix -D filename will not work with every operating system. Please check this link for compatibility.

    If for some reason you need to force the command then use --force. If it says invalid then use -f.
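
The warning above about converting a file that already uses CRLF can be illustrated with a few lines of Python. This is only a sketch of a naive byte replacement, not a description of how unix2dos is implemented; the helper names naive_lf_to_crlf and safe_lf_to_crlf are made up for the example.

```python
def naive_lf_to_crlf(data: bytes) -> bytes:
    """Blindly replace every LF with CRLF (hypothetical helper)."""
    return data.replace(b"\n", b"\r\n")


def safe_lf_to_crlf(data: bytes) -> bytes:
    """Normalize to LF first, so an existing CRLF is not doubled."""
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


unix_text = b"one\ntwo\n"
dos_text = naive_lf_to_crlf(unix_text)   # first conversion is fine
broken = naive_lf_to_crlf(dos_text)      # second pass doubles the CR

print(dos_text)                                # b'one\r\ntwo\r\n'
print(broken)                                  # b'one\r\r\ntwo\r\r\n'
print(safe_lf_to_crlf(dos_text) == dos_text)   # True
```

Normalizing to LF before converting makes the operation safe to repeat, which is why the second helper leaves an already converted file unchanged.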

  3. A GitHub article on line endings is commonly mentioned when talking about this topic.

    My personal experience with using the often recommended core.autocrlf configuration setting was very mixed.

    I’m using Windows with Cygwin, dealing with both Windows and Unix projects at different times. Even my Windows projects sometimes use Bash shell scripts, which require Unix (LF) line endings.

    Using GitHub’s recommended core.autocrlf setting for Windows, if I check out a Unix project (which does work perfectly on Cygwin – or maybe I’m contributing to a project that I use on my Linux server), the text files are checked out with Windows (CRLF) line endings, creating problems.

    Basically, for a mixed environment like I have, setting the global core.autocrlf to any of the options will not work well in some cases. This option might be set on a local (repository) Git configuration, but even that wouldn’t be good enough for a project that contains both Windows- and Unix-related stuff (e.g., I have a Windows project with some Bash utility scripts).

    The best choice I’ve found is to create per-repository .gitattributes files. The GitHub article mentions it.

    Example from that article:

    # Set the default behavior, in case people don't have core.autocrlf set.
    * text=auto

    # Explicitly declare text files you want to always be normalized and converted
    # to native line endings on checkout.
    *.c text
    *.h text

    # Declare files that will always have CRLF line endings on checkout.
    *.sln text eol=crlf

    # Denote all files that are truly binary and should not be modified.
    *.png binary
    *.jpg binary

    In one of my project’s repository:

    * text=auto

    *.txt text eol=lf
    *.xml text eol=lf
    *.json text eol=lf
    *.properties text eol=lf
    *.conf text eol=lf
    *.awk text eol=lf
    *.sed text eol=lf
    *.sh text eol=lf

    *.png binary
    *.jpg binary
    *.p12 binary

    It takes a bit more setup, but you do it once per project, and then contributors on any OS should have no trouble with line endings when working with this project.
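
To make the idea of per-pattern rules concrete, here is a rough Python sketch of how a list of pattern rules could be resolved for a file name, with later rules overriding earlier ones. It is a simplification under stated assumptions: the RULES list and the eol_for helper are hypothetical, and fnmatch only approximates the pattern matching that Git applies to .gitattributes.

```python
from fnmatch import fnmatch

# Simplified (pattern, eol) rules in file order; a hypothetical stand-in
# for a .gitattributes file. None means "no explicit eol, let text=auto decide".
RULES = [
    ("*", None),
    ("*.sh", "lf"),
    ("*.sln", "crlf"),
]


def eol_for(filename: str):
    """Return the eol of the last matching rule (later lines win)."""
    result = None
    for pattern, eol in RULES:
        if fnmatch(filename, pattern):
            result = eol
    return result


print(eol_for("build.sh"))    # lf
print(eol_for("app.sln"))     # crlf
print(eol_for("README.md"))   # None
```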

  4. I think Basiloungas’s answer is close, but out of date (at least on Mac).

    Open the ~/.gitconfig file and set safecrlf to false:

    [core]
        autocrlf = input
        safecrlf = false

    That will apparently make it ignore the end-of-line character (it worked for me, anyway).

  5. In Vim, open the file (e.g., :e YOURFILE followed by Enter), then run:

    :set noendofline binary
    :wq
  6. I had this problem too.

    SVN doesn’t do any line ending conversion, so files are committed with CRLF line endings intact. If you then use git-svn to put the project into git then the CRLF endings persist across into the git repository, which is not the state git expects to find itself in – the default being to only have unix/linux (LF) line endings checked in.

    When you then check out the files on windows, the autocrlf conversion leaves the files intact (as they already have the correct endings for the current platform), however the process that decides whether there is a difference with the checked in files performs the reverse conversion before comparing, resulting in comparing what it thinks is an LF in the checked out file with an unexpected CRLF in the repository.

    As far as I can see your choices are:

    1. Re-import your code into a new git repository without using git-svn; this means line endings are converted in the initial git commit --all
    2. Set autocrlf to false, and ignore the fact that the line endings are not in git’s preferred style
    3. Check out your files with autocrlf off, fix all the line endings, check everything back in, and turn it back on again.
    4. Rewrite your repository’s history so that the original commit no longer contains the CRLF that git wasn’t expecting. (The usual caveats about history rewriting apply)

    Footnote: if you choose option #2 then my experience is that some of the ancillary tools (rebase, patch etc) do not cope with CRLF files and you will end up sooner or later with files with a mix of CRLF and LF (inconsistent line endings). I know of no way of getting the best of both.
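
To check whether a file has ended up with inconsistent line endings, a small Python sketch like the following can count each kind. The function names are made up for this example, and it works on raw bytes, so it makes no attempt to tell text files from binary ones.

```python
import re


def count_line_endings(data: bytes) -> dict:
    """Count CRLF, lone LF, and lone CR sequences in raw file bytes."""
    crlf = len(re.findall(rb"\r\n", data))
    lf = len(re.findall(rb"(?<!\r)\n", data))
    cr = len(re.findall(rb"\r(?!\n)", data))
    return {"crlf": crlf, "lf": lf, "cr": cr}


def has_mixed_endings(data: bytes) -> bool:
    """True when more than one kind of line ending is present."""
    counts = count_line_endings(data)
    return sum(1 for n in counts.values() if n) > 1


sample = b"first\r\nsecond\nthird\r\n"
print(count_line_endings(sample))   # {'crlf': 2, 'lf': 1, 'cr': 0}
print(has_mixed_endings(sample))    # True
```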

  7. Removing the below from the ~/.gitattributes file,

    * text=auto

    will prevent Git from checking line-endings in the first place.

  8. Most tools on Windows also accept a simple LF in text files. For example, you can control the behaviour of Visual Studio with a file named ‘.editorconfig’ containing, for example (partial content):

    indent_style = space
    indent_size = 2
    end_of_line = lf    <<====
    charset = utf-8

    Only the original Windows Notepad does not work with LF, but there are some more proper simple editor tools available!

    Hence you should use LF in text files in Windows too. This is my message, and it is strongly recommended! There isn’t any reason to use CRLF in windows!

    (The same discussion applies to using \ in include paths in C/C++. It is bovine fecal matter. Use #include <pathTo/myheader.h> with a forward slash! It is the C/C++ standard and all Microsoft compilers support it.)

    Hence the proper setting for Git is:

    git config core.autocrlf false 

    My message: Forget such old thinking programs as dos2unix and unix2dos. Clarify in your team that LF is proper to use under Windows.

  9. How to make Git ignore different line endings

    http://www.rtuin.nl/2013/02/how-to-make-git-ignore-different-line-endings/ (Not working)

    You can disable the CRLF behaviour completely, or per filetype by changing entries in your .gitattributes file. In my case, I put this:

    * -crlf

    This tells Git to ignore the line endings for all files and not to change the files in your working directory, even if you have core.autocrlf set to true, false, or input.

    echo "* -crlf" > .gitattributes

    Do this on a separate commit or Git might still see whole files as modified when you make a single change (depending on if you have changed the autocrlf option).

    This one really works. Git will respect the line endings in mixed line ending projects and not warn you about them.

  10. I don’t know much about Git on Windows, but…

    It appears to me that Git is converting the return format to match that of the running platform (Windows). CRLF is the default return format on Windows, while LF is the default return format for most other OSes.

    Chances are, the return format will be adjusted properly when the code is moved to another system. I also reckon Git is smart enough to keep binary files intact rather than trying to convert LFs to CRLFs in, say, JPEG files.

    In summary, you probably don’t need to fret too much over this conversion. However, if you go to archive your project as a tarball, fellow coders would probably appreciate having LF line terminators rather than CRLF. Depending on how much you care (and depending on you not using Notepad), you might want to set Git to use LF returns if you can 🙂

    Appendix: CR is ASCII code 13, LF is ASCII code 10. Thus, CRLF is two bytes, while LF is one.
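
The byte values mentioned above are easy to confirm in Python. This is just an illustrative sketch with a made-up two-line sample string:

```python
CR, LF = "\r", "\n"
print(ord(CR), ord(LF))   # 13 10

unix_text = "line one\nline two\n"
windows_text = unix_text.replace("\n", "\r\n")

# Each line ending grows from one byte to two.
print(len(unix_text.encode("ascii")))      # 18
print(len(windows_text.encode("ascii")))   # 20
```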

  11. It should read:

    warning: (If you check it out/or clone to another folder with your current core.autocrlf being true,)LF will be replaced by CRLF

    The file will have its original line endings in your (current) working directory.

    This picture should explain what it means.

    [Image from the original answer, not reproduced here]

  12. Make sure that you have installed the latest version of Git

    I did as in a previous answer, git config core.autocrlf false, when using Git (version 2.7.1), but it did not work.

    It worked after I upgraded Git (from 2.7.1 to 2.20.1).

    1. Open the file in Notepad++.
    2. Go to the menu Edit → EOL Conversion.
    3. Click on Windows Format.
    4. Save the file.
  13. The OP’s question is Windows-related, and I could not use the other answers without going to the directory; even running the file in Notepad++ as administrator did not work.

    So I had to go this route:

    cd "C:\Program Files (x86)\Git\etc"
    git config --global core.autocrlf false
  14. Many text editors allow you to change to LF. See the Atom instructions below. It is simple and explicit.


    Click CRLF on the bottom right, then select LF in the dropdown at the top.

  15. Other answers are fantastic for the general concept. I ran into a problem where, after updating the setting, the warning still appeared on existing repositories that had commits made under the previous setting.
  16. i/lf w/crlf attr/ src/components/quotes/ExQuoteForm.js
      i/lf w/lf attr/ src/components/quotes/HighlightedQuote.js
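
A listing in this shape can be scanned for files whose two columns disagree. The Python sketch below assumes that the i/ and w/ columns describe the line endings in the index and in the working tree respectively (as in the output of git ls-files --eol); that reading is an assumption, since the answer does not say which command produced the lines. The parse_eol_line helper is hypothetical and only handles the simple shape shown, with no spaces in the attributes or the path.

```python
def parse_eol_line(line: str) -> dict:
    """Split one 'i/<eol> w/<eol> attr/<attrs> <path>' line into fields."""
    index, worktree, attr, path = line.split(None, 3)
    return {
        "index": index.split("/", 1)[1],
        "worktree": worktree.split("/", 1)[1],
        "attr": attr.split("/", 1)[1],
        "path": path,
    }


lines = [
    "i/lf w/crlf attr/ src/components/quotes/ExQuoteForm.js",
    "i/lf w/lf attr/ src/components/quotes/HighlightedQuote.js",
]

# Files whose working-tree endings differ from what is in the index.
mismatched = [
    entry["path"]
    for entry in map(parse_eol_line, lines)
    if entry["index"] != entry["worktree"]
]
print(mismatched)   # ['src/components/quotes/ExQuoteForm.js']
```
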
  17. In a GNU/Linux shell prompt, the dos2unix and unix2dos commands allow you to easily convert/format your files coming from MS Windows.

  18. CR and LF are control characters that mark line endings in text files.

    CR (\r) puts the cursor at the beginning of a line but doesn’t create a new line. CR alone was the line ending on classic Mac OS; modern macOS uses LF.

    LF (\n) creates a new line, but it doesn’t put the cursor at the beginning of that line. The cursor stays in the column where it was on the previous line. LF alone is the line ending on Unix and Linux.

    CRLF (\r\n) creates a new line as well as puts the cursor at the beginning of the new line. This is the line ending on Windows.

    To summarize:

    1. LF (LINE FEED)
    • stands for Line Feed
    • denoted with \n
    • creates a new line in the code
    • The ASCII code is 10.
    • Used by Unix and other OSes based around it, including modern macOS.
    2. CR (CARRIAGE RETURN)
    • stands for Carriage Return
    • denoted with \r
    • puts the cursor at the beginning of a line.
    • The ASCII code is 13.
    • Used by classic Mac OS (before Mac OS X).
    3. CRLF (CARRIAGE RETURN AND LINE FEED)
    • stands for Carriage Return and Line Feed
    • denoted with \r\n
    • creates a new line and puts the cursor at the beginning of that new line.
    • The ASCII codes are 13 for CR and 10 for LF.
    • Primarily used on Windows OS.

    Git uses LF by default. So when we use Git on Windows with line-ending conversion enabled, it shows a warning like “CRLF will be replaced by LF” and automatically converts CRLF into LF, so that the code stays compatible across platforms.

    NB: Don’t worry…see this less as a warning and more as a notification message.
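
As a mental model of the conversion described above, the round trip can be sketched in Python. This is a conceptual illustration only, not Git's implementation (Git also has to decide which files are text at all); the function names to_repository and to_working_tree are invented for the example.

```python
def to_repository(data: bytes) -> bytes:
    """On commit: store LF only (CRLF -> LF)."""
    return data.replace(b"\r\n", b"\n")


def to_working_tree(data: bytes, windows: bool = True) -> bytes:
    """On checkout: give Windows users CRLF, everyone else LF."""
    lf_only = data.replace(b"\r\n", b"\n")
    return lf_only.replace(b"\n", b"\r\n") if windows else lf_only


edited_on_windows = b"hello\r\nworld\r\n"
stored = to_repository(edited_on_windows)

print(stored)                                         # b'hello\nworld\n'
print(to_working_tree(stored) == edited_on_windows)   # True
print(to_working_tree(stored, windows=False))         # b'hello\nworld\n'
```

Because the stored form is always LF, the same commit can be checked out with either convention without changing what is recorded in the repository.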

  19. I had the same issue, and doing git add . && git reset reverted all line endings correctly.

  20. Yet another TL;DR

    Iterator on list: next() returns the next element of the list

    Iterator generator: next() will compute the next element on the fly (execute code)

    You can see the yield/generator as a way to manually run the control flow from outside (like continue loop one step), by calling next, however complex the flow.

    Note: The generator is NOT a normal function. It remembers its previous state, such as local variables (its stack frame). See other answers or articles for a detailed explanation. The generator can only be iterated over once. You could do without yield, but it would not be as nice, so it can be considered ‘very nice’ language sugar.
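
The "only once" point can be seen in a short sketch. The generator count_up_to is a made-up example; the comparison with a list at the end shows that an ordinary list can be iterated as many times as you like.

```python
def count_up_to(n):
    """Yield 1..n, computing each value only when asked."""
    i = 1
    while i <= n:
        yield i
        i += 1


gen = count_up_to(3)
print(next(gen))   # 1
print(list(gen))   # [2, 3]
print(list(gen))   # []  (the generator is exhausted)

numbers = [1, 2, 3]
print(list(iter(numbers)), list(iter(numbers)))   # [1, 2, 3] [1, 2, 3]
```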

  21. Here’s a simple yield based approach, to compute the fibonacci series, explained:

    def fib(limit=50):
        a, b = 0, 1
        for i in range(limit):
            yield b
            a, b = b, a + b

    When you enter this into your REPL and then try and call it, you’ll get a mystifying result:

    >>> fib()
    <generator object fib at 0x7fa38394e3b8>

    This is because the presence of yield signaled to Python that you want to create a generator, that is, an object that generates values on demand.

    So, how do you generate these values? This can either be done directly by using the built-in function next, or, indirectly by feeding it to a construct that consumes values.

    Using the built-in next() function, you directly invoke .next/__next__, forcing the generator to produce a value:

    >>> g = fib()
    >>> next(g)
    1
    >>> next(g)
    1
    >>> next(g)
    2
    >>> next(g)
    3
    >>> next(g)
    5

    Indirectly, if you provide fib to a for loop, a list initializer, a tuple initializer, or anything else that expects an object that generates/produces values, you’ll “consume” the generator until no more values can be produced by it (and it returns):

    results = []
    for i in fib(30):  # consumes fib
        results.append(i)

    # can also be accomplished with
    results = list(fib(30))  # consumes fib

    Similarly, with a tuple initializer:

    >>> tuple(fib(5))  # consumes fib
    (1, 1, 2, 3, 5)

    A generator differs from a function in the sense that it is lazy. It accomplishes this by maintaining its local state and allowing you to resume whenever you need to.

    When you first invoke fib by calling it:

    f = fib() 

    Python sees that the function contains the yield keyword and simply returns a generator object back at you; none of the body runs yet. Not very helpful, it seems.

    When you then request that it generate the first value, directly or indirectly, it executes all statements it finds until it encounters a yield; it then yields back the value you supplied to yield and pauses. For an example that better demonstrates this, let’s use some print calls (replace with print "text" if on Python 2):

    def yielder(value):
        """ This is an infinite generator. Only use next on it """
        while 1:
            print("I'm going to generate the value for you")
            print("Then I'll pause for a while")
            yield value
            print("Let's go through it again.")

    Now, enter in the REPL:

    >>> gen = yielder("Hello, yield!") 

    You now have a generator object waiting for a command to generate a value. Use next and see what gets printed:

    >>> next(gen)  # runs until it finds a yield
    I'm going to generate the value for you
    Then I'll pause for a while
    'Hello, yield!'

    The unquoted results are what’s printed. The quoted result is what is returned from yield. Call next again now:

    >>> next(gen)  # continues from yield and runs again
    Let's go through it again.
    I'm going to generate the value for you
    Then I'll pause for a while
    'Hello, yield!'

    The generator remembers it was paused at yield value and resumes from there. The next message is printed and the search for the yield statement to pause at is performed again (due to the while loop).

  22. yield is similar to return. The difference is:

    yield turns a function into a generator function, whose result is iterable (in the following example, calling primes(n = 1) returns an iterable generator).
    What it essentially means is that the next time a value is requested from the generator, the function continues from where it left off (which is right after the yield expression).

    def isprime(n):
        if n == 1:
            return False
        for x in range(2, n):
            if n % x == 0:
                return False
        else:
            return True

    def primes(n = 1):
        while(True):
            if isprime(n): yield n
            n += 1

    for n in primes():
        if n > 100: break
        print(n)

    In the above example, if isprime(n) is true, it yields the prime number. In the next iteration it continues from the next line

    n += 1 
  23. In Python, generators (a special type of iterator) are used to generate a series of values, and the yield keyword in a generator function plays the role that return plays in a regular function.

    The other fascinating thing the yield keyword does is save the state of a generator function.

    So, we can set a number to a different value each time the generator yields.

    Here’s an instance:

    # isPrime() is assumed to be defined elsewhere.
    def getPrimes(number):
        while True:
            if isPrime(number):
                number = yield number  # a miracle occurs here
            number += 1

    def printSuccessivePrimes(iterations, base=10):
        primeGenerator = getPrimes(base)
        primeGenerator.send(None)
        for power in range(iterations):
            print(primeGenerator.send(base ** power))
  24. In [4]: def make_cake(numbers):
         ...:     for i in range(numbers):
         ...:         yield 'Cake {}'.format(i)
         ...:

      In [5]: factory = make_cake(5)
  25. All of the answers here are great; but only one of them (the most voted one) relates to how your code works. Others are relating to generators in general, and how they work.

    So I won’t repeat what generators are or what yield does; I think these are covered by great existing answers. However, after spending a few hours trying to understand code similar to yours, I’ll break down how it works.

    Your code traverses a binary tree structure. Let’s take this tree for example:

            5
           / \
          3   6
         / \   \
        1   4   8

    And another simpler implementation of a binary-search tree traversal:

    class Node(object):
        ..
        def __iter__(self):
            if self.has_left_child():
                for child in self.left:
                    yield child

            yield self.val

            if self.has_right_child():
                for child in self.right:
                    yield child

    The execution code is on the Tree object, which implements __iter__ as this:

    def __iter__(self):

        class EmptyIter():
            def next(self):
                raise StopIteration

        if self.root:
            return self.root.__iter__()
        return EmptyIter()

    The while candidates statement can be replaced with for element in tree; Python translates this to

    it = iter(TreeObj)  # returns iter(self.root) which calls self.root.__iter__()
    for element in it:
        .. process element ..

    Because Node.__iter__ function is a generator, the code inside it is executed per iteration. So the execution would look like this:

    1. The root element is first; check whether it has left children and iterate over them with for (let’s call this iterator it1 because it is the first iterator object)
    2. It has a child, so the for is executed. The for child in self.left creates a new iterator from self.left, which is a Node object itself (it2)
    3. Same logic as 2, and a new iterator is created (it3)
    4. Now we have reached the left end of the tree. it3 has no left children, so it continues and yields self.val
    5. On the next call to next(it3), it raises StopIteration and exits, since it has no right children (it reaches the end of the function without yielding anything)
    6. it1 and it2 are still active; they are not exhausted, and calling next(it2) would yield values, not raise StopIteration
    7. Now we are back in the it2 context, and call next(it2), which continues where it stopped: right after the yield child statement. Since it has no more left children, it continues and yields its self.val.

    The catch here is that every iteration creates sub-iterators to traverse the tree and holds the state of the current iterator. Once it reaches the end, it travels back up the stack, and values are returned in the correct order (the smallest value is yielded first).
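
For readers who want to run the traversal, here is a self-contained sketch in Python 3. It is not the asker's code or the exact class from this answer: the constructor and the use of yield from (which delegates to a sub-iterator, replacing the inner for loops) are choices made for this example. It builds the tree with the values 5, 3, 6, 1, 4 and 8 used above.

```python
class Node:
    """Binary-tree node; attribute names mirror the answer, the rest is a sketch."""

    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

    def __iter__(self):
        # In-order traversal: left subtree, this node, right subtree.
        if self.left is not None:
            yield from self.left    # delegates to a sub-iterator
        yield self.val
        if self.right is not None:
            yield from self.right


# 5 at the root, 3 and 6 below it, then 1 and 4 under 3, and 8 under 6.
root = Node(5, Node(3, Node(1), Node(4)), Node(6, right=Node(8)))
print(list(root))   # [1, 3, 4, 5, 6, 8]
```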

    Your code example did something similar with a different technique: it populated a one-element list for every child, then on the next iteration it popped it and ran the function code on the current object (hence the self).

    I hope this contributed a little to this legendary topic. I spent several good hours drawing this process to understand it.

  26. The yield keyword in Python is used to exit from a function without discarding the state of its local variables; when the generator is resumed, execution starts from the point where it left off.

    The below example demonstrates the working of yield:

    def counter():
        x = 2
        while x < 5:
            yield x
            x += 1

    print("Initial value of x: ", counter())

    for y in counter():
        print(y)

    The above code generates the following output:

    Initial value of x:  <generator object counter at 0x7f0263020ac0>
    2
    3
    4
  27. Can also send data back to the generator!

    Indeed, as many answers here explain, using yield creates a generator.

    You can use the yield keyword to send data back to a “live” generator.

    Example:

    Let’s say we have a method that translates from English to some other language. At the beginning, it does something heavy that should be done only once. We want this method to run forever (don’t really know why.. :)) and receive words to be translated.

    def translator():
        # load all the words in English language and the translation to 'other lang'
        my_words_dict = {'hello': 'hello in other language', 'dog': 'dog in other language'}

        while True:
            word = (yield)
            yield my_words_dict.get(word, 'Unknown word...')

    Running:

    my_words_translator = translator()

    next(my_words_translator)
    print(my_words_translator.send('dog'))

    next(my_words_translator)
    print(my_words_translator.send('cat'))

    will print:

    dog in other language
    Unknown word...

    To summarise:

    Use the send method on a generator to send data back into it. To allow that, a (yield) expression is used inside the generator.
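
A slightly different sketch of the same mechanism uses a single yield that both receives a value and hands back a result, so only one next() call is needed to prime the generator. The running_average generator is a made-up example, not part of the answer above.

```python
def running_average():
    """Coroutine-style generator: receives numbers, yields their mean so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average   # send() resumes here with the new value
        total += value
        count += 1
        average = total / count


avg = running_average()
next(avg)             # prime the generator: run to the first yield
print(avg.send(10))   # 10.0
print(avg.send(20))   # 15.0
print(avg.send(30))   # 20.0
```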

  28. yield in Python is in a way similar to the return statement, except for some differences. If multiple values have to be returned from a function, a return statement returns all of them at once (for example, as a list), and the whole collection has to be stored in memory by the caller. But what if we don’t want to use extra memory? Instead, we want to get each value from the function when we need it. This is where yield comes in. Consider the following function:

    def fun():
        yield 1
        yield 2
        yield 3

    And the caller is:

    def caller():
        gen = fun()  # create the generator once
        print('First value printing')
        print(next(gen))
        print('Second value printing')
        print(next(gen))
        print('Third value printing')
        print(next(gen))

    The above code segment (caller function), when called, outputs:

    First value printing
    1
    Second value printing
    2
    Third value printing
    3

    As can be seen from above, yield returns a value to its caller, but when the generator is resumed with next(), it doesn’t start from the first statement, but from the statement right after the yield. In the above example, “First value printing” was printed and next(gen) was called. 1 was returned and printed. Then “Second value printing” was printed and next(gen) was called again. Instead of returning 1 (the first statement), it returned 2, i.e., the statement just after yield 1. The same process is repeated further. Note that calling fun() itself each time would create a new generator object rather than resume the existing one.

  29. Simple answer

    When a function contains at least one yield statement, it automatically becomes a generator function. When you call a generator function, Python returns a generator object; on the first next() call, Python executes the code in the generator function until a yield statement occurs. The yield statement freezes the function with all its internal state. When you call next() on the generator again, Python continues executing the code from the frozen position until a yield statement occurs again, and so on. The generator finishes when the function runs to its end without reaching another yield statement.

    Benchmark

    Create a list and return it:

    def my_range(n):
        my_list = []
        i = 0
        while i < n:
            my_list.append(i)
            i += 1
        return my_list

    @profile
    def function():
        my_sum = 0
        my_values = my_range(1000000)
        for my_value in my_values:
            my_sum += my_value

    function()

    Results with:

    Total time: 1.07901 s
    Timer unit: 1e-06 s

    Line #      Hits         Time  Per Hit   % Time  Line Contents
    ==============================================================
         9                                           @profile
        10                                           def function():
        11         1          1.1      1.1      0.0      my_sum = 0
        12         1     494875.0 494875.0     45.9      my_values = my_range(1000000)
        13   1000001     262842.1      0.3     24.4      for my_value in my_values:
        14   1000000     321289.8      0.3     29.8          my_sum += my_value

    Line #    Mem usage    Increment  Occurences   Line Contents
    ============================================================
         9   40.168 MiB   40.168 MiB           1   @profile
        10                                         def function():
        11   40.168 MiB    0.000 MiB           1       my_sum = 0
        12   78.914 MiB   38.746 MiB           1       my_values = my_range(1000000)
        13   78.941 MiB    0.012 MiB     1000001       for my_value in my_values:
        14   78.941 MiB    0.016 MiB     1000000           my_sum += my_value

    Generate values on the fly:

    def my_range(n):
        i = 0
        while i < n:
            yield i
            i += 1

    @profile
    def function():
        my_sum = 0

        for my_value in my_range(1000000):
            my_sum += my_value

    function()

    Results with:

    Total time: 1.24841 s
    Timer unit: 1e-06 s

    Line #      Hits         Time  Per Hit   % Time  Line Contents
    ==============================================================
         7                                           @profile
         8                                           def function():
         9         1          1.1      1.1      0.0      my_sum = 0
        10
        11   1000001     895617.3      0.9     71.7      for my_value in my_range(1000000):
        12   1000000     352793.7      0.4     28.3          my_sum += my_value

    Line #    Mem usage    Increment  Occurences   Line Contents
    ============================================================
         7   40.168 MiB   40.168 MiB           1   @profile
         8                                         def function():
         9   40.168 MiB    0.000 MiB           1       my_sum = 0
        10
        11   40.203 MiB    0.016 MiB     1000001       for my_value in my_range(1000000):
        12   40.203 MiB    0.020 MiB     1000000           my_sum += my_value

    Summary

    The generator function needs a little more time to execute than the function that returns a list, but it uses much less memory.
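
Without installing a profiler, the memory difference can be glimpsed with sys.getsizeof from the standard library. This is only a rough sketch: getsizeof measures the container object itself, not the integers it refers to, and the function names squares_list and squares_gen are invented for the example.

```python
import sys


def squares_list(n):
    """Build and return the whole list at once."""
    return [i * i for i in range(n)]


def squares_gen(n):
    """Yield the same values one at a time."""
    for i in range(n):
        yield i * i


n = 100_000
as_list = squares_list(n)
as_gen = squares_gen(n)

# The list holds 100,000 references; the generator holds only its paused state.
print(sys.getsizeof(as_list) > sys.getsizeof(as_gen))   # True
print(sum(as_list) == sum(as_gen))                      # True
```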

  30. A simple use case:

    >>> def foo():
            yield 100
            yield 20
            yield 3

    >>> for i in foo(): print(i)

    100
    20
    3
    >>>

    How it works: when called, the function returns an object immediately. The object can be passed to the next() function. Whenever the next() function is called, your function runs up until the next yield and provides the return value for the next() function.

    Under the hood, the for loop recognizes that the object is a generator object and uses next() to get the next value.

    In some languages, such as JavaScript (ES6 and later), it’s implemented a little differently: next is a member function of the generator object, and you can pass values from the caller every time it gets the next value. So if result is the generator, you could do something like y = result.next(555), and the program yielding values could say something like z = yield 999. The value of y would be 999, which next gets from the yield, and the value of z would be 555, which yield gets from the next. Python’s next() function and the generator’s send() method have a similar effect.

  31. Usually, it’s used to create an iterator out of a function. Think of ‘yield’ as an append() to your function and of your function as an array. If certain criteria are met, you can add that value to your function to make it an iterator.

    arr = []
    if 2 > 0:
        arr.append(2)

    def func():
        if 2 > 0:
            yield 2

    The values produced will be the same for both.

    The main advantage of using yield is creating iterators. Iterators don’t compute the value of each item when instantiated. They only compute it when you ask for it. This is known as lazy evaluation.
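
Lazy evaluation is what makes it possible to work with a sequence that never ends. The sketch below uses itertools.islice from the standard library to take the first few items; the naturals generator is a made-up example.

```python
from itertools import islice


def naturals():
    """Infinite generator: 0, 1, 2, ... (never builds a full list)."""
    n = 0
    while True:
        yield n
        n += 1


evens = (n for n in naturals() if n % 2 == 0)   # still nothing computed
first_five = list(islice(evens, 5))             # the work happens here
print(first_five)   # [0, 2, 4, 6, 8]
```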

  32. Generators allow you to get individual processed items immediately (without needing to wait for the whole collection to be processed). This is illustrated in the example below.

    import time

    def get_gen():
        for i in range(10):
            yield i
            time.sleep(1)

    def get_list():
        ret = []
        for i in range(10):
            ret.append(i)
            time.sleep(1)
        return ret

    start_time = time.time()
    print('get_gen iteration (individual results come immediately)')
    for i in get_gen():
        print(f'result arrived after: {time.time() - start_time:.0f} seconds')
    print()

    start_time = time.time()
    print('get_list iteration (results come all at once)')
    for i in get_list():
        print(f'result arrived after: {time.time() - start_time:.0f} seconds')

    get_gen iteration (individual results come immediately)
    result arrived after: 0 seconds
    result arrived after: 1 seconds
    result arrived after: 2 seconds
    result arrived after: 3 seconds
    result arrived after: 4 seconds
    result arrived after: 5 seconds
    result arrived after: 6 seconds
    result arrived after: 7 seconds
    result arrived after: 8 seconds
    result arrived after: 9 seconds

    get_list iteration (results come all at once)
    result arrived after: 10 seconds
    result arrived after: 10 seconds
    result arrived after: 10 seconds
    result arrived after: 10 seconds
    result arrived after: 10 seconds
    result arrived after: 10 seconds
    result arrived after: 10 seconds
    result arrived after: 10 seconds
    result arrived after: 10 seconds
    result arrived after: 10 seconds
  33. The yield keyword is used in enumeration/iteration where the function is expected to return more than one output. I want to quote this very simple example A:

    # example A
    def getNumber():
        for r in range(1,10):
            return r

    The above function will return only 1 even when it’s called multiple times. Now if we replace return with yield as in example B:

    # example B
    def getNumber():
        for r in range(1,10):
            yield r

    It will return 1 when first called, 2 when called again, then 3, 4, and so on up to 9.

    Although example B is conceptually correct, to call it in Python 3 we have to do the following:

    g = getNumber()  # instance
    print(next(g))   # will print 1
    print(next(g))   # will print 2
    print(next(g))   # will print 3

    # so to assign it to variables
    v = getNumber()
    v1 = next(v)  # v1 will have 1
    v2 = next(v)  # v2 will have 2
    v3 = next(v)  # v3 will have 3
  34. names = ['Sam', 'Sarah', 'Thomas', 'James']

    # Using function
    def greet(name):
        return f'Hi, my name is {name}.'

    for each_name in names:
        print(greet(each_name))

    # Output:
    # Hi, my name is Sam.
    # Hi, my name is Sarah.
    # Hi, my name is Thomas.
    # Hi, my name is James.

    # using generator
    def greetings(names):
        for each_name in names:
            yield f'Hi, my name is {each_name}.'

    for greet_name in greetings(names):
        print(greet_name)

    # Output:
    # Hi, my name is Sam.
    # Hi, my name is Sarah.
    # Hi, my name is Thomas.
    # Hi, my name is James.
  35. Key points

    • The grammar for Python uses the presence of the yield keyword to make a function that returns a generator.

    • A generator is a kind of iterator, which is the main way that looping occurs in Python.

    • A generator is essentially a resumable function. Unlike return that returns a value and ends a function, the yield keyword returns a value and suspends a function.

    • When next(g) is called on a generator, the function resumes execution where it left off.

    • Only when the function encounters an explicit or implied return does it actually end.

    Technique for writing and understanding generators

    An easy way to understand and think about generators is to write a regular function with print() instead of yield:

    def f(n):
        for x in range(n):
            print(x)
            print(x * 10)

    Watch what it outputs:

    >>> f(3)
    0
    0
    1
    10
    2
    20

    When that function is understood, substitute yield for print to get a generator that produces the same values:

    def f(n):
        for x in range(n):
            yield x
            yield x * 10

    Which gives:

    >>> list(f(3))
    [0, 0, 1, 10, 2, 20]

    Iterator protocol

    The answer to “what yield does” can be short and simple, but it is part of a larger world, the so-called “iterator protocol”.

    On the sender side of the iterator protocol, there are two relevant kinds of objects. The iterables are things you can loop over. And the iterators are objects that track the loop state.

    On the consumer side of the iterator protocol, we call iter() on the iterable object to get an iterator. Then we call next() on the iterator to retrieve values from the iterator. When there is no more data, a StopIteration exception is raised:

    >>> s = [10, 20, 30]    # The list is the "iterable"
    >>> it = iter(s)        # This is the "iterator"
    >>> next(it)            # Gets values out of an iterator
    10
    >>> next(it)
    20
    >>> next(it)
    30
    >>> next(it)
    Traceback (most recent call last):
     ...
    StopIteration

    To make this all easier for us, for-loops call iter and next on our behalf:

    >>> for x in s:
    ...     print(x)
    ...
    10
    20
    30

    A person could write a book about all this, but these are the key points. When I teach Python courses, I’ve found that this is a minimal sufficient explanation to build understanding and start using it right away. In particular, the trick of writing a function with print, testing it, and then converting to yield seems to work well with all levels of Python programmers.

  36. To understand what yield does, one must understand what a generator is. Moreover, before understanding generators, you must understand iterables.

    Iterables

    When you create a list, you can naturally read its elements one by one. The process of reading its items one by one is called iteration:

    >>> mylist = [1, 2, 3]
    >>> for i in mylist:
    ...     print(i)
    1
    2
    3

    mylist is an iterable. When you use a list comprehension, you create a list and therefore an iterable:

    >>> mylist = [x*x for x in range(3)]
    >>> for i in mylist:
    ...     print(i)
    0
    1
    4

    Everything you can use with for… in… is an iterable: lists, strings, files…

    These iterables are convenient because you can read them as often as you like, but you store all the values in memory, which is not always desirable when you have many values.

    Generators

    A generator is a kind of iterator, one that can only be iterated over once. A generator does not store all its values in memory, but generates them on the fly:

    (Think of an electrical generator: it generates electricity but does not store energy ;)

    >>> mygenerator = (x*x for x in range(3))
    >>> for i in mygenerator:
    ...     print(i)
    0
    1
    4

    As long as () is used instead of [], the list comprehension becomes a generator expression. However, since a generator can only be used once, you cannot execute for i in mygenerator a second time: the generator calculates 0, then discards it, then calculates 1, and finally calculates 4.

    The yield keyword is used in the same way as return, except that the function will return a generator.

    >>> def createGenerator():
    ...     mylist = range(3)
    ...     for i in mylist:
    ...         yield i*i
    ...
    >>> mygenerator = createGenerator()
    >>> print(mygenerator)
    <generator object createGenerator at 0xb7555c34>
    >>> for i in mygenerator:
    ...     print(i)
    0
    1
    4

    This example itself is useless, but when you need a function to return a large number of values and only need to read them once, using yield becomes convenient.

    To master yield, one needs to be clear that when the function is called, the code written in the function body does not run. The function only returns the generator object. Beginners are likely to be confused about this.

    Second, understand that the code will continue from where it left off every time for uses the generator.

    The most difficult part now is:

    The first time for calls the generator object created from your function, it runs the code in the function from the beginning until it hits yield, and then it returns the first value of the loop. Then, each subsequent call runs the next iteration of the loop you wrote in the function and returns the next value. This continues until the generator is considered empty, which happens when the function runs without hitting yield. That may be because the loop has ended, or because an “if/else” condition is no longer satisfied.

    This is my personal understanding; I hope it helps you!

Tags: linux, windows