Modified on
29 Dec 2022 07:04 pm
Skill-Lync
Python can read data from many sources, including files and databases. Two of the most frequently used file types are .txt and .csv. You can import and export such files either with Python's built-in file handling or with the csv library.
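As a minimal sketch of the CSV-library route mentioned above, the standard csv module can write and read rows without any third-party package. The rows and column names here are made up for illustration, and an in-memory buffer stands in for a file on disk.

```python
import csv
import io

# Hypothetical rows used only for illustration.
rows = [
    {"name": "alice", "score": "90"},
    {"name": "bob", "score": "85"},
]

# Export: write the rows as CSV text into an in-memory buffer.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["name", "score"])
writer.writeheader()
writer.writerows(rows)

# Import: rewind the buffer and read the rows back as dictionaries.
buffer.seek(0)
loaded = list(csv.DictReader(buffer))
print(loaded)
```

Note that the csv module returns every value as a string, so numeric columns have to be converted by hand.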
First, we need to import the data into Python. Importing it as a DataFrame makes the data much easier to handle, and Python's pandas module is built specifically for working with DataFrames.
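The examples below cover 14 kinds of data source. pandas reads some of them directly; others need a helper library before the data can be placed in a DataFrame.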
pandas.read_csv(filepath_or_buffer, sep=NoDefault.no_default, delimiter=None, header='infer', names=NoDefault.no_default, index_col=None, usecols=None, squeeze=None, prefix=NoDefault.no_default, mangle_dupe_cols=True, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skipinitialspace=False, skiprows=None, skipfooter=0, nrows=None, na_values=None, keep_default_na=True, na_filter=True, verbose=False, skip_blank_lines=True, parse_dates=None, infer_datetime_format=False, keep_date_col=False, date_parser=None, dayfirst=False, cache_dates=True, iterator=False, chunksize=None, compression='infer', thousands=None, decimal='.', lineterminator=None, quotechar='"', quoting=0, doublequote=True, escapechar=None, comment=None, encoding=None, encoding_errors='strict', dialect=None, error_bad_lines=None, warn_bad_lines=None, on_bad_lines=None, delim_whitespace=False, low_memory=True, memory_map=False, float_precision=None, storage_options=None)
import pandas as pd
pd.read_csv('sometext.csv')
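Most of the parameters in the signature above can be left at their defaults. The sketch below illustrates a few commonly used ones (sep, usecols, nrows) on made-up, semicolon-separated data held in an in-memory buffer rather than a real file.

```python
import io

import pandas as pd

# Hypothetical semicolon-separated data standing in for a file on disk.
raw = io.StringIO(
    "id;name;score\n"
    "1;alice;90\n"
    "2;bob;85\n"
    "3;carol;77\n"
)

# sep sets the delimiter, usecols keeps selected columns,
# and nrows limits how many data rows are read.
df = pd.read_csv(raw, sep=";", usecols=["id", "score"], nrows=2)
print(df)
```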
pandas.read_excel(io, sheet_name=0, header=0, names=None, index_col=None, usecols=None, squeeze=None, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skiprows=None, nrows=None, na_values=None, keep_default_na=True, na_filter=True, verbose=False, parse_dates=False, date_parser=None, thousands=None, decimal='.', comment=None, skipfooter=0, convert_float=None, mangle_dupe_cols=True, storage_options=None)
import pandas as pd
pd.read_excel('someexcelfile.xlsx')
import pandas as pd
# read the zipped CSV; the archive must contain exactly one file
df = pd.read_csv('test.zip', compression='zip')
# display dataset
print(df.head())
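To try the zip example without an existing archive, one can build a small archive first. This is a sketch under assumptions: the file names test.zip and data.csv and the sample rows are invented, and everything is written to a temporary directory.

```python
import os
import tempfile
import zipfile

import pandas as pd

# Work in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    zip_path = os.path.join(tmp_dir, "test.zip")

    # Create an archive holding exactly one CSV file (hypothetical contents).
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("data.csv", "city,temp\nPune,31\nDelhi,35\n")

    # pandas decompresses the archive and parses the single CSV inside it.
    zipped_df = pd.read_csv(zip_path, compression="zip")

print(zipped_df.head())
```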
# importing pandas
import pandas as pd
# read text file into pandas DataFrame
df = pd.read_csv("gfg.txt", sep=" ")
# display DataFrame
print(df)
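A single space as the separator breaks when columns are padded with several spaces. A hedged alternative is the regular-expression separator r"\s+", shown here on invented data in an in-memory buffer.

```python
import io

import pandas as pd

# Hypothetical text data with uneven spacing between columns.
text = io.StringIO(
    "name   age  city\n"
    "ravi   28   Chennai\n"
    "meena  31   Mumbai\n"
)

# r"\s+" treats any run of whitespace as one delimiter.
text_df = pd.read_csv(text, sep=r"\s+")
print(text_df)
```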
import pandas as pd
file_df = pd.read_json('E:/datasets/filename.json')
file_df.head()
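The same call also accepts a file-like object. The sketch below assumes the JSON is a list of records, one object per row; the field names are invented for illustration.

```python
import io

import pandas as pd

# Hypothetical JSON: a list of records, one object per row.
json_text = '[{"id": 1, "lang": "python"}, {"id": 2, "lang": "sql"}]'

# Wrapping the string in StringIO gives read_json a file-like object.
json_df = pd.read_json(io.StringIO(json_text))
print(json_df.head())
```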
import xml.etree.ElementTree as ET
import pandas as pd
# read the whole XML file and parse it into an element tree
with open('filename.xml', 'r') as f:
    root = ET.fromstring(f.read())
data = []
cols = []
# each child of the root becomes one column of the DataFrame
for child in root:
    data.append([subchild.text for subchild in child])
    cols.append(child.tag)
df = pd.DataFrame(data).T
df.columns = cols
print(df)
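When every child element of the XML root represents one record, an alternative is to build one dictionary per record so that the sub-element tags become column names. This sketch assumes such a layout; the XML string and tag names are invented.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Hypothetical XML in which each <student> element is one record.
xml_text = """
<students>
    <student><name>Asha</name><marks>82</marks></student>
    <student><name>Kiran</name><marks>74</marks></student>
</students>
"""

root = ET.fromstring(xml_text)

# One dict per record: sub-element tags become column names.
records = [{field.tag: field.text for field in student} for student in root]
xml_df = pd.DataFrame(records)
print(xml_df)
```

Element text is always read as a string, so a column such as marks would still need converting, for example with pd.to_numeric.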
import pandas as pd
# read_html returns a list of DataFrames, one per table found on the page
table_MN = pd.read_html('https://en.wikipedia.org/wiki/something')
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
#create an image
imar = np.array([[[1.,0.],[0.,0.]],
[[0.,1.],[0.,1.]],
[[0.,0.],[1.,1.]]]).transpose()
plt.imsave('pic.jpg', imar)
#create dataframe
df = pd.DataFrame([[0,""]], columns=["Feature1","Feature2"])
# read the image
im = plt.imread('pic.jpg')
plt.imshow(im)
plt.show()
import pandas as pd
subjectsdata = {'Name': ['sravan', 'sravan', 'sravan', 'sravan',
'sravan', 'sravan', 'sravan', 'sravan',
'Ojaswi', 'Ojaswi', 'Ojaswi', 'Ojaswi',
'Ojaswi', 'Ojaswi', 'Ojaswi', 'Ojaswi',
'Rohith', 'Rohith', 'Rohith', 'Rohith',
'Rohith', 'Rohith', 'Rohith', 'Rohith'],
'college': ['VFSTRU', 'VFSTRU', 'VFSTRU', 'VFSTRU',
'VFSTRU', 'VFSTRU', 'VFSTRU', 'VFSTRU',
'VIT', 'VIT', 'VIT', 'VIT', 'VIT', 'VIT',
'VIT', 'VIT', 'IIT-Bhu', 'IIT-Bhu', 'IIT-Bhu',
'IIT-Bhu', 'IIT-Bhu', 'IIT-Bhu', 'IIT-Bhu',
'IIT-Bhu'],
'subject': ['java', 'dbms', 'dms', 'coa', 'python', 'dld',
'android', 'iot', 'java', 'dbms', 'dms', 'coa',
'python', 'dld', 'android', 'iot', 'java',
'dbms', 'dms', 'coa', 'python', 'dld', 'android',
'iot']
}
df = pd.DataFrame(subjectsdata)
print(df)
from tabula import read_pdf
# recent tabula-py versions return a list of DataFrames, one per table found
tables = read_pdf('data.pdf')
df = tables[0]
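# install the required packages from a terminal first
pip install pandas
pip install python-docx
import pandas as pd
from docx import Document
document = Document("<<docx file path>>")
# take the first table in the document
table = document.tables[0]
data = [[cell.text for cell in row.cells] for row in table.rows]
df = pd.DataFrame(data)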
import librosa
song_path = 'track1.mp3'
y,sr = librosa.load(song_path,sr=22050)
print(y)
The audio samples in y are returned as a NumPy array, and sr holds the sampling rate.
import pylab
import imageio
filename = '/tmp/file.mp4'
vid = imageio.get_reader(filename, 'ffmpeg')
nums = [10, 287]
for num in nums:
    image = vid.get_data(num)
    fig = pylab.figure()
    fig.suptitle('image #{}'.format(num), fontsize=20)
    pylab.imshow(image)
pylab.show()
pandas.read_sql(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, columns=None, chunksize=None)
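As a rough illustration of the sql, con and params arguments in the signature above, the sketch below queries an in-memory SQLite database. The marks table and its rows are invented for the example; a real project would connect to its own database instead.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database with a hypothetical "marks" table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE marks (name TEXT, subject TEXT, score INTEGER)")
con.executemany(
    "INSERT INTO marks VALUES (?, ?, ?)",
    [("sravan", "java", 88), ("Ojaswi", "dbms", 92), ("Rohith", "python", 79)],
)
con.commit()

# sql is the query text; params fills the "?" placeholder safely.
sql_df = pd.read_sql(
    "SELECT name, score FROM marks WHERE score > ? ORDER BY score",
    con,
    params=(80,),
)
con.close()
print(sql_df)
```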
Author
Navin Baskar