
CSE

Modified on

29 Dec 2022 07:04 pm

How To Load Files Into Python?


Skill-Lync

Python can read data from a number of sources, including files and databases. The .txt and .csv file types are two that are frequently used. You can import and export such files with either Python's built-in file handling or the csv library.
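As a minimal sketch of the built-in route mentioned above, the following uses only the standard library: `open()` for a plain text file and the `csv` module for a CSV file. The file names `notes.txt` and `people.csv` and their contents are made up for illustration.

```python
import csv
import os
import tempfile

# Work in a temporary folder so the example leaves no files behind.
# The file names and contents here are illustrative assumptions.
with tempfile.TemporaryDirectory() as folder:
    txt_path = os.path.join(folder, "notes.txt")
    csv_path = os.path.join(folder, "people.csv")

    # Export: write a text file and a CSV file.
    with open(txt_path, "w", encoding="utf-8") as f:
        f.write("first line\nsecond line\n")

    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "age"])
        writer.writerows([["Asha", 31], ["Ravi", 27]])

    # Import: read the text file line by line.
    with open(txt_path, encoding="utf-8") as f:
        lines = [line.rstrip("\n") for line in f]

    # Import: read the CSV file as a list of dictionaries keyed by header.
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))

print(lines)
print(rows)
```

Note that the `csv` module returns every value as a string, so `age` comes back as `'31'` rather than `31`.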

How to Import Data into Python? 

First, we need to import the data into Python. Importing the data as a DataFrame makes it much easier to handle. Python's pandas module is built specifically for working with data as DataFrames.

What are the Different File Types You Can Load in Python?

These are 14 file types that can be loaded in Python. Most of them can be read directly with pandas; the rest need a dedicated library.

Comma-separated values (CSV)

pandas.read_csv(filepath_or_buffer, sep=NoDefault.no_default, delimiter=None, header='infer', names=NoDefault.no_default, index_col=None, usecols=None, squeeze=None, prefix=NoDefault.no_default, mangle_dupe_cols=True, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skipinitialspace=False, skiprows=None, skipfooter=0, nrows=None, na_values=None, keep_default_na=True, na_filter=True, verbose=False, skip_blank_lines=True, parse_dates=None, infer_datetime_format=False, keep_date_col=False, date_parser=None, dayfirst=False, cache_dates=True, iterator=False, chunksize=None, compression='infer', thousands=None, decimal='.', lineterminator=None, quotechar='"', quoting=0, doublequote=True, escapechar=None, comment=None, encoding=None, encoding_errors='strict', dialect=None, error_bad_lines=None, warn_bad_lines=None, on_bad_lines=None, delim_whitespace=False, low_memory=True, memory_map=False, float_precision=None, storage_options=None)

import pandas as pd

pd.read_csv('sometext.csv')
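Most of the parameters in the signature above can be left at their defaults. The sketch below shows a few that come up often (`sep`, `usecols`, `na_values`, `nrows`). It reads from an in-memory string through `io.StringIO` instead of a real file, and the column names and values are invented for the example.

```python
import io

import pandas as pd

# Illustrative data; a real call would pass a file path instead of StringIO.
raw = io.StringIO(
    "id;city;temp\n"
    "1;Chennai;31.5\n"
    "2;Pune;missing\n"
    "3;Delhi;28.0\n"
    "4;Kochi;30.2\n"
)

df = pd.read_csv(
    raw,
    sep=";",                   # fields are separated by semicolons, not commas
    usecols=["city", "temp"],  # load only these columns
    na_values=["missing"],     # treat this marker as a missing value (NaN)
    nrows=3,                   # read only the first three data rows
)

print(df)
```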

XLSX

pandas.read_excel(io, sheet_name=0, header=0, names=None, index_col=None, usecols=None, squeeze=None, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skiprows=None, nrows=None, na_values=None, keep_default_na=True, na_filter=True, verbose=False, parse_dates=False, date_parser=None, thousands=None, decimal='.', comment=None, skipfooter=0, convert_float=None, mangle_dupe_cols=True, storage_options=None)

import pandas as pd

pd.read_excel('someexcelfile.xlsx')

Zip file

import pandas as pd

# read the dataset from a zip archive (the archive must contain a single data file)
df = pd.read_csv('test.zip', compression='zip')

# display the first rows of the dataset
print(df.head())
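`compression='zip'` expects the archive to hold a single data file. When an archive contains several files, one option is to open the member you want with the standard `zipfile` module and pass the file handle to `read_csv`. The sketch below builds a small archive first so it can run on its own; the archive and member names are assumptions.

```python
import os
import tempfile
import zipfile

import pandas as pd

with tempfile.TemporaryDirectory() as folder:
    zip_path = os.path.join(folder, "bundle.zip")  # hypothetical archive name

    # Build an archive with two members so there is something to choose from.
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("sales.csv", "item,qty\npen,10\nbook,4\n")
        archive.writestr("readme.txt", "not a data file\n")

    with zipfile.ZipFile(zip_path) as archive:
        members = archive.namelist()      # list what the archive contains
        with archive.open("sales.csv") as handle:
            df = pd.read_csv(handle)      # read just the chosen member

print(members)
print(df)
```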

Plain Text (txt)

# importing pandas

import pandas as pd  

# read text file into pandas DataFrame

df = pd.read_csv("gfg.txt", sep=" ")

# display DataFrame

print(df)
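`sep=" "` splits on exactly one space, so columns padded with several spaces end up misaligned. A regular-expression separator such as `sep=r"\s+"` treats any run of whitespace as one delimiter. A small sketch with made-up data, read from memory rather than from `gfg.txt`:

```python
import io

import pandas as pd

# Columns are aligned with a varying number of spaces (illustrative data).
text = io.StringIO(
    "name    score\n"
    "Asha    91\n"
    "Ravi  78\n"
)

# r"\s+" matches one or more whitespace characters between fields.
df = pd.read_csv(text, sep=r"\s+")

print(df)
```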

JSON

import pandas as pd

file_df = pd.read_json('E:/datasets/filename.json')

file_df.head()
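`read_json` works best when the file is already table-shaped. For nested JSON, one approach is to load it with the standard `json` module and flatten it with `pd.json_normalize`. The records below are invented for illustration and parsed from a string instead of a file.

```python
import json

import pandas as pd

# Illustrative nested records; json.load(file) would be used for a real file.
raw = """
[
  {"id": 1, "name": "Asha", "address": {"city": "Chennai", "pin": "600001"}},
  {"id": 2, "name": "Ravi", "address": {"city": "Pune", "pin": "411001"}}
]
"""

records = json.loads(raw)

# Nested keys become dotted column names such as "address.city".
df = pd.json_normalize(records)

print(df.columns.tolist())
print(df)
```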

XML

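import xml.etree.ElementTree as ET
import pandas as pd

# parse the XML file and get its root element
root = ET.parse('filename.xml').getroot()

data = []
cols = []

# each child of the root becomes one column; its sub-elements supply the values
for child in root:
    data.append([subchild.text for subchild in child])
    cols.append(child.tag)

# data holds one list per column, so transpose before naming the columns
df = pd.DataFrame(data).T
df.columns = cols

print(df)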
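The loop above treats each child of the root as a column. Many XML files are laid out the other way round, with one element per record. A sketch for that layout, using only `xml.etree.ElementTree` and an invented XML snippet; the resulting list of dictionaries can be passed straight to `pd.DataFrame`.

```python
import xml.etree.ElementTree as ET

# Illustrative XML with one <student> element per record.
xml_text = """
<students>
  <student><name>Asha</name><subject>python</subject></student>
  <student><name>Ravi</name><subject>java</subject></student>
</students>
"""

root = ET.fromstring(xml_text)

# Turn each record into a dictionary of {tag: text}.
rows = [{field.tag: field.text for field in record} for record in root]

print(rows)
```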

HTML

import pandas as pd

# read_html returns a list with one DataFrame per table found on the page
tables = pd.read_html('https://en.wikipedia.org/wiki/something')

# pick the first table
table_MN = tables[0]

Images

import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

#create an image

imar = np.array([[[1.,0.],[0.,0.]],

                 [[0.,1.],[0.,1.]],

                 [[0.,0.],[1.,1.]]]).transpose()

plt.imsave('pic.jpg', imar)

#create dataframe

df = pd.DataFrame([[0,""]], columns=["Feature1","Feature2"])

# read the image

im = plt.imread('pic.jpg')

plt.imshow(im)

plt.show()

Hierarchical Data Format (HDF)

import pandas as pd

subjectsdata = {'Name': ['sravan', 'sravan', 'sravan', 'sravan',

                         'sravan', 'sravan', 'sravan', 'sravan',

                         'Ojaswi', 'Ojaswi', 'Ojaswi', 'Ojaswi',

                         'Ojaswi', 'Ojaswi', 'Ojaswi', 'Ojaswi',

                         'Rohith', 'Rohith', 'Rohith', 'Rohith',

                         'Rohith', 'Rohith', 'Rohith', 'Rohith'],

                  

                'college': ['VFSTRU', 'VFSTRU', 'VFSTRU', 'VFSTRU',

                            'VFSTRU', 'VFSTRU', 'VFSTRU', 'VFSTRU',

                            'VIT', 'VIT', 'VIT', 'VIT', 'VIT', 'VIT',

                            'VIT', 'VIT', 'IIT-Bhu', 'IIT-Bhu', 'IIT-Bhu',

                            'IIT-Bhu', 'IIT-Bhu', 'IIT-Bhu', 'IIT-Bhu',

                            'IIT-Bhu'],

                  

                'subject': ['java', 'dbms', 'dms', 'coa', 'python', 'dld',

                            'android', 'iot', 'java', 'dbms', 'dms', 'coa',

                            'python', 'dld', 'android', 'iot', 'java',

                            'dbms', 'dms', 'coa', 'python', 'dld', 'android',

                            'iot']

                }

df = pd.DataFrame(subjectsdata)

print(df)
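The snippet above only builds the DataFrame. A sketch of the missing round trip to an HDF5 file with `DataFrame.to_hdf` and `pd.read_hdf` follows. It assumes the optional PyTables package (`pip install tables`) is installed; the file name `subjects.h5`, the key `subjects` and the shortened data are placeholders.

```python
import os
import tempfile

import pandas as pd

# A shortened stand-in for the DataFrame built above.
df = pd.DataFrame({
    "Name": ["sravan", "Ojaswi", "Rohith"],
    "college": ["VFSTRU", "VIT", "IIT-Bhu"],
    "subject": ["java", "dbms", "python"],
})

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "subjects.h5")  # placeholder file name
    try:
        # Write the DataFrame under a key, then load it back by the same key.
        df.to_hdf(path, key="subjects", mode="w")
        loaded = pd.read_hdf(path, key="subjects")
    except ImportError:
        # pandas raises ImportError when PyTables is not installed.
        loaded = None

print(loaded)
```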

PDF

from tabula import read_pdf

# read_pdf returns a list of DataFrames, one per table detected in the PDF
dfs = read_pdf('data.pdf')

DOCX

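pip install pandas
pip install python-docx

import pandas as pd
from docx import Document

document = Document("<<docx file path>>")

# take the first table in the document
table = document.tables[0]

# collect the text of every cell, row by row
data = [[cell.text for cell in row.cells] for row in table.rows]
df = pd.DataFrame(data)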

MP3

import librosa

song_path = 'track1.mp3'

y,sr = librosa.load(song_path,sr=22050)

print(y)

Here, y holds the audio samples as a NumPy array and sr is the sampling rate.

MP4

import pylab

import imageio

filename = '/tmp/file.mp4'

vid = imageio.get_reader(filename,  'ffmpeg')

nums = [10, 287]

for num in nums:

    image = vid.get_data(num)

    fig = pylab.figure()

    fig.suptitle('image #{}'.format(num), fontsize=20)

    pylab.imshow(image)

pylab.show()

SQL

pandas.read_sql(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, columns=None, chunksize=None)
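As a runnable sketch of `read_sql`, the example below uses an in-memory SQLite database from the standard `sqlite3` module as the connection. The table name, columns and rows are invented; with another database, `con` would typically be a SQLAlchemy connection instead.

```python
import sqlite3

import pandas as pd

# In-memory database so the example needs no external server or file.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE courses (id INTEGER PRIMARY KEY, title TEXT, hours INTEGER)"
)
con.executemany(
    "INSERT INTO courses (title, hours) VALUES (?, ?)",
    [("Python basics", 12), ("SQL basics", 8), ("Pandas", 15)],
)
con.commit()

# params fills the ? placeholder safely; index_col turns id into the index.
df = pd.read_sql(
    "SELECT id, title, hours FROM courses WHERE hours > ? ORDER BY id",
    con,
    params=(10,),
    index_col="id",
)
con.close()

print(df)
```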


Author


Navin Baskar

