7 Ways of Reading a File in Python | by Nicholas Obert | Jul, 2022

Can you read a file without opening it?

Photo by Gianluca Cinnante on Unsplash

The Python programming language is well known for its great flexibility, both in terms of syntax and in how you can achieve a given goal. In fact, there are many different ways for a Python programmer to obtain the same final result, from using fancy syntax to hacking built-in functionality.

In this article, I’ll show you 7 ways of reading the content of a file in Python.

The first and most rudimentary method of reading a file in Python is using the built-in open() function, which returns a readable stream. You can then use the read() method to read the content of the stream.

Say we have a file called file.txt in the same directory as the Python script.
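A minimal sketch of this old-school approach, assuming a UTF-8 text file; the helper name read_with_open is hypothetical and only there to keep the example self-contained:

```python
def read_with_open(path: str) -> str:
    """Open the file, read everything, and close the stream by hand."""
    file = open(path, "r", encoding="utf-8")  # returns a readable stream
    content = file.read()                     # read the whole stream at once
    file.close()                              # easy to forget, and skipped if read() raises
    return content
```

Calling read_with_open("file.txt") from the script’s directory returns the file content as a string.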

Okay, that was a bit banal and non-Pythonic. Let’s look at a better way.

Opening the file in a with statement (a context manager) is the most common way of reading a file in Python. The context manager prevents resource leaks, such as file handles left open, if an error occurs while reading from the file, because it is responsible for closing the file in every case.
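A minimal sketch of the context-manager form, again assuming a UTF-8 text file; read_with_context_manager is a hypothetical helper name:

```python
def read_with_context_manager(path: str) -> str:
    """Read a file inside a with block so it is closed automatically."""
    with open(path, "r", encoding="utf-8") as file:
        # The file is closed when the block exits, even if read() raises.
        return file.read()
```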

This method isn’t fundamentally different from the old-school way, but it’s generally considered the better, cleaner, and safer approach.

Now that we have crossed off the “default” approaches, let’s move on to the more exotic ways of reading a file’s content.

Developers say Python has a module for everything. And, unsurprisingly, it also has a module for interacting with the file system, called pathlib.

Reading a file with pathlib is as easy as creating a Path object and calling its read_text() method, with no need to close the stream manually.
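As a short sketch, assuming a UTF-8 text file (read_with_pathlib is a hypothetical helper name):

```python
from pathlib import Path


def read_with_pathlib(path: str) -> str:
    """Read a text file through a Path object; pathlib opens and closes it."""
    return Path(path).read_text(encoding="utf-8")
```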

Although it is very easy, you won’t often find this approach in Python codebases, since it requires importing a module.

Next up, we will take a look at the “hacks”, or the unusual, non-default ways of reading a file that achieve the end goal nonetheless.

If you have ever worked with a UNIX-like operating system, you’re certainly familiar with the standard cat command. The name is actually derived from the word “concatenate” and has nothing to do with cute pets.

The standard-library subprocess module allows you to run shell commands and handle their output from your Python script.
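A minimal sketch of reading a file through cat, assuming a UNIX-like system where cat is on the PATH; read_with_cat is a hypothetical helper name:

```python
import subprocess


def read_with_cat(path: str) -> str:
    """Let the external cat command read the file and capture its output."""
    result = subprocess.run(
        ["cat", "--", path],   # "--" stops option parsing, so odd file names are safe
        capture_output=True,   # collect stdout and stderr instead of printing them
        text=True,             # decode the output to str
        check=True,            # raise CalledProcessError if cat fails
    )
    return result.stdout
```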

For those of you who use a Windows operating system, you can use the analogous type command instead… or just switch to Linux.

If you think this way of reading a file is unusual, fasten your seat belt: we’re getting even more exotic.

What if you aren’t happy with Python’s built-in capabilities? You write your own custom C extensions. To keep the article concise, I’m not going to explain the full process of creating a C extension for Python. If you’re interested in learning more about this topic, check out this guide.

First of all, you have to write the C extension:

Then you create the setup.py script:

Finally, you build and install the C extension:

python3 setup.py build
python3 setup.py install --user

Now you can import your custom C module in your Python scripts:

This was a bit overly complicated for just reading the content of a file, wasn’t it? The next approach won’t contain any C code, but it’s equally over-the-top.

We’ll use the Flask package (a third-party library, not part of Python’s standard library) to run a small web server that sends the file over HTTP.

On the other end, you can use the requests library to request the file content from the server:
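As a rough, dependency-free sketch of the same idea, the block below swaps Flask and requests for the standard library’s http.server and urllib.request; it is a stand-in, not the Flask version described above. The names QuietHandler and read_file_over_http, and the choice to serve a whole directory on a random local port, are assumptions made for illustration.

```python
import threading
import urllib.parse
import urllib.request
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class QuietHandler(SimpleHTTPRequestHandler):
    """Static-file handler that does not log every request to stderr."""

    def log_message(self, format, *args):
        pass


def read_file_over_http(directory: str, filename: str) -> str:
    """Serve `directory` on a free local port and fetch `filename` over HTTP."""
    handler = partial(QuietHandler, directory=directory)
    # Port 0 asks the operating system for any free port.
    server = ThreadingHTTPServer(("127.0.0.1", 0), handler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()

    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    url = f"http://127.0.0.1:{port}/{urllib.parse.quote(filename)}"
    try:
        with opener.open(url, timeout=5) as response:
            return response.read().decode("utf-8")
    finally:
        server.shutdown()      # stop the serve_forever loop
        server.server_close()  # release the listening socket
```

The server half plays the role of the web server and the opener half plays the role of the HTTP client; in a real setup they would run as two separate scripts.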

Pretty unusual, right? The next one will be a real hack.

Is it possible to read a file without opening it? As you might have guessed, we’re going to try all possible byte combinations and check whether any of them corresponds to the actual file content. To achieve this, however, we cannot open the file directly in Python, so we’re going to take advantage of the md5sum UNIX command, which returns the MD5 hash of a given file.

Once we have the checksum, we begin to brute-force until we find the correct byte combination.
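The two steps can be sketched roughly as follows, assuming md5sum is available on the PATH. The names md5_of_file and brute_force_content and the max_length cutoff are hypothetical; the sketch tries every length up to the cutoff because the file size is not known in advance.

```python
import hashlib
import itertools
import subprocess
from typing import Optional


def md5_of_file(path: str) -> str:
    """Ask the external md5sum command for the file's MD5 hex digest."""
    result = subprocess.run(
        ["md5sum", path], capture_output=True, text=True, check=True
    )
    # md5sum prints "<digest>  <file name>"; keep only the digest.
    return result.stdout.split()[0]


def brute_force_content(target_md5: str, max_length: int) -> Optional[bytes]:
    """Try every byte string of up to max_length bytes until the hash matches."""
    for length in range(max_length + 1):
        # product(range(256), repeat=length) yields all 256**length combinations.
        for candidate in itertools.product(range(256), repeat=length):
            data = bytes(candidate)
            if hashlib.md5(data).hexdigest() == target_md5:
                return data
    return None  # nothing of at most max_length bytes matched
```

A call such as brute_force_content(md5_of_file("file.txt"), max_length=3) only finishes in reasonable time for files of a few bytes.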

This approach is terrible and awesome at the same time. The maximum number of combinations it has to check before finding the correct file content is 256ⁿ, where n is the size of the file in bytes and 256 is the number of possible values of a single byte.
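To get a feel for how fast 256ⁿ grows, here is a small illustrative calculation; max_candidates is a hypothetical helper name:

```python
def max_candidates(size_in_bytes: int) -> int:
    """Upper bound on the byte combinations to try for a file of the given size."""
    return 256 ** size_in_bytes


for size in (1, 2, 4, 8):
    print(f"{size} byte(s): {max_candidates(size)} combinations")
# 1 byte(s): 256 combinations
# 2 byte(s): 65536 combinations
# 4 byte(s): 4294967296 combinations
# 8 byte(s): 18446744073709551616 combinations
```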

No matter how inefficient this method is, your Python script will eventually guess the file content without ever reading it, unless your computer dies first.

To wrap it up, there are many ways you can read the content of a file in Python. Some make sense and should be used in real codebases, while others are esoteric hacks and are meant to be just a proof of concept or a coding exercise.

I hope you enjoyed this article. If you have any other fancy way of reading from a file in Python, please share it in a comment. Thanks for reading!

If you want to learn more about C extensions for Python, I suggest you take a look at this story below:
