Introduction to Data Analysis & Python
What is Data Analysis?
Data Analysis:
- Data analysis is inspecting and modeling data to derive meaningful conclusions.
- Data analysis is the process of collecting, cleaning, transforming, and modeling data to extract useful insights, draw conclusions, and support decision-making.
- It typically applies statistical and logical techniques to explore, summarize, visualize, and interpret raw data.
- The goal is to uncover trends, patterns, and insights that help solve problems or guide future actions.
- It plays a crucial role in a wide range of fields like business, science, healthcare, finance, and government.
Data Analysis Process:
- Data Collection: Data collection is the first step in the data analysis process. In this phase, relevant and accurate data is gathered from various sources such as databases, surveys, APIs, web scraping, or sensors. It is important to ensure that the data is complete, trustworthy, and suitable for the analysis objectives. Proper documentation of data sources is also essential for reproducibility and transparency.
- Data Cleaning: Once the data is collected, the next step is data cleaning. This involves correcting or removing inaccurate, incomplete, or irrelevant parts of the dataset. Common cleaning tasks include removing duplicate entries, handling missing values, standardizing formats (such as dates and text), and identifying outliers or inconsistent values. Clean data is crucial for producing reliable and accurate analysis results.
- Data Exploration: Data exploration, or exploratory data analysis (EDA), is used to understand the structure and characteristics of the dataset. During this phase, analysts use summary statistics and visualizations to identify patterns, spot anomalies, and explore relationships between variables. This step often includes calculating measures such as mean, median, and standard deviation, and creating visualizations like histograms, box plots, and scatter plots. EDA helps form hypotheses and guides further analysis.
- Data Analysis: In the data analysis phase, various techniques are applied to extract meaningful insights and answer specific questions. This can involve statistical methods such as hypothesis testing or more advanced techniques like machine learning models (e.g., regression, classification, or clustering). The goal is to uncover trends, test assumptions, and make data-driven decisions based on the findings.
- Data Interpretation: After conducting the analysis, the next step is to interpret the results. This means understanding what the analysis reveals about the original question or problem. It involves explaining the significance of the findings, identifying implications, and recognizing any limitations or assumptions made during the analysis. Interpretation bridges the gap between raw data output and actionable insight.
- Data Visualization: The final step is data visualization, which focuses on presenting the results in a clear and compelling way. Charts, graphs, dashboards, and other visual tools are used to communicate the findings to stakeholders, whether technical or non-technical. Effective data visualization makes it easier to understand complex data and supports informed decision-making. Good visualization practices include clarity, simplicity, and relevance to the target audience.
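The first three steps of the process can be sketched in a few lines of Python. This is only a minimal illustration using the standard library: the CSV text, its column names, and the helper functions collect, clean, and explore are hypothetical and stand in for a real data source and real cleaning rules.

```python
import csv
import io
import statistics

# Hypothetical raw data; in practice this would come from a file, database, or API.
RAW_CSV = """order_id,region,amount
1,North,120.50
2,South,
3,North,98.00
3,North,98.00
4,East,310.25
5,South,87.40
"""


def collect(text):
    """Data collection: read rows from a CSV source into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Data cleaning: drop rows with a missing amount and repeated order IDs."""
    seen = set()
    cleaned = []
    for row in rows:
        if not row["amount"] or row["order_id"] in seen:
            continue
        seen.add(row["order_id"])
        cleaned.append({**row, "amount": float(row["amount"])})
    return cleaned


def explore(rows):
    """Data exploration: summary statistics for the amount column."""
    amounts = [row["amount"] for row in rows]
    return {
        "count": len(amounts),
        "mean": statistics.mean(amounts),
        "median": statistics.median(amounts),
        "stdev": statistics.stdev(amounts),
    }


rows = clean(collect(RAW_CSV))
summary = explore(rows)
print(summary)
```

Here the row with a missing amount and the repeated order are removed before any statistics are computed, which is why cleaning comes before exploration.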
Types of Data Analysis:
Data Analysis Applications:
Advantages of Data Analysis:
- Enhanced Decision-Making: Data analysis enables decisions based on factual insights rather than intuition, improving accuracy and reducing uncertainty.
- Improved Operational Efficiency: It identifies inefficiencies and streamlines workflows, leading to increased productivity and cost savings.
- Better Customer Understanding: Data analysis provides deeper insight into customers, allowing businesses to tailor customer service to their needs, offer more personalization, and build stronger relationships with them.
- Competitive Advantage: Utilizing data analysis allows businesses to stay ahead of competitors by identifying emerging trends, anticipating market shifts, and adapting strategies accordingly.
- Risk Mitigation: Businesses are vulnerable to risk and fraud. Early detection of potential issues allows proactive measures, reducing exposure to financial or operational risks.
- Cost Reduction: With the help of techniques such as predictive analytics, businesses can spot improvement opportunities, trends, and patterns in their data and plan their strategies accordingly. Over time, this saves the money and resources that would otherwise be spent on implementing the wrong strategies.
- Innovation and Product Development: Data analysis can inspire innovation by revealing unmet customer needs and identifying areas for product or service improvement. By understanding customer preferences and market trends, businesses can develop new products and services that better meet customer demands.
- Performance Monitoring: Data analysis enables businesses to track key performance indicators (KPIs) and monitor overall performance. This allows for timely identification of areas where performance is lagging and enables businesses to make necessary adjustments to improve results.
Disadvantages of Data Analysis:
Differences between Data Analysis and Analytics
1. Focus:
- Data Analysis primarily focuses on inspecting, cleaning, transforming, and modeling data to extract useful information. It examines historical and current data to understand what has already happened, and it supports conclusions and decisions based on that record.
- Analytics goes beyond examining historical data. It uses that information to predict future trends, prescribe actions, and optimize processes for better decision-making. Data analysis is like studying a snapshot of the past to see what happened and find patterns; analytics is like using that snapshot to predict what might happen in the future.
2. Scope:
- Data Analysis typically involves examining historical data to identify patterns, trends, and relationships within a specific dataset.
- Analytics has a broader scope. It includes predictive (forecasting future events) and prescriptive (recommending actions) capabilities to support strategic planning.
3. Techniques:
- Data Analysis employs traditional statistical methods, visualization tools, and exploratory techniques to understand the characteristics of data.
- Analytics utilizes advanced statistical models, machine learning algorithms, and data mining techniques to uncover patterns and make predictions about future events.
4. Outcome:
- The main outcome of Data Analysis is to generate actionable insights and information that help in making informed decisions.
- The primary outcome of Analytics is to provide deeper, actionable insights that not only support decisions but also aim to improve future outcomes through proactive strategies.
5. Example:
- Data Analysis: Analyzing sales performance from the previous quarter; identifying patterns in customer complaints to find recurring issues; visualizing demographic trends in customer data.
- Analytics: Forecasting future sales based on past trends; recommending personalized products using customer behavior data; optimizing supply chains through predictive maintenance models.
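The contrast can be made concrete with a small sketch. The sales figures below are hypothetical, and the straight-line fit in fit_line is just one simple forecasting choice picked for illustration, not a recommended model.

```python
import statistics

# Hypothetical quarterly sales figures (illustrative values only).
sales = [100.0, 110.0, 125.0, 135.0]
quarters = [1, 2, 3, 4]

# Data analysis: describe what has already happened.
average_sales = statistics.mean(sales)
growth = sales[-1] - sales[0]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_mean, y_mean = statistics.mean(xs), statistics.mean(ys)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    return slope, y_mean - slope * x_mean


# Analytics: use the historical pattern to estimate the next quarter.
slope, intercept = fit_line(quarters, sales)
forecast_q5 = slope * 5 + intercept

print(f"Average so far: {average_sales}, growth: {growth}")
print(f"Estimated sales for quarter 5: {forecast_q5}")
```

The first half only summarizes past quarters; the second half extends the trend to a quarter that has not happened yet.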
What is Python?
Python is a powerful, open-source, high-level, object-oriented programming language created by Guido van Rossum and first released in 1991. It is now developed by a community of contributors under the stewardship of the Python Software Foundation. It is one of the most widely used programming languages. Its simple, easy-to-use syntax makes it a good first language for someone learning to program, and it is also worth having in any programmer's stack because it can be used for everything from web development to software development and scientific applications.
Features of Python:
1) Easy to Learn and Use
- Python is easy to learn compared with many other programming languages.
- Its syntax is straightforward and reads much like English.
- Semicolons and curly brackets are not needed; indentation defines the code block.
2) Expressive Language
- Python can perform complex tasks using a few lines of code.
- For example, the hello world program is simply print("Hello World").
- It takes only one line, while the equivalent program in Java or C takes several lines.
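A short sketch of these two points: indentation marking code blocks, and a compact expression doing the work of a longer loop. The function name classify is made up for this example.

```python
def classify(number):
    # The indented lines form the body of the function and of the if statement;
    # no curly brackets or semicolons are needed.
    if number % 2 == 0:
        return "even"
    return "odd"


# A list comprehension builds the whole list in a single line.
labels = [classify(n) for n in range(4)]

print("Hello World")
print(labels)
```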
3) Interpreted Language
- Python is an interpreted language: the interpreter executes a program statement by statement at run time, with no separate compilation step for the programmer to run.
- Because it is interpreted, Python code is easier to debug and is portable across platforms.
4) Cross-platform Language
- Python runs on different platforms such as Windows, Linux, UNIX, and macOS.
- For this reason, Python is called a portable language.
- It enables programmers to develop software for several competing platforms by writing a program only once.
5) Free and Open Source
- Python is freely available to everyone.
- It can be downloaded from its official website, www.python.org.
- It has a large worldwide community that is dedicated to creating new Python modules and functions.
- Open source means that anyone can download its source code without paying anything.
6) Object-Oriented Language
- Python supports object-oriented programming, built around the concepts of classes and objects.
- It supports inheritance, polymorphism, encapsulation, and related features.
- The object-oriented approach helps programmers write reusable code and develop applications with less code.
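The three object-oriented ideas named above can be seen together in a small sketch. The Shape, Rectangle, and Circle classes are hypothetical examples written for this note, not part of any library.

```python
import math


class Shape:
    """Base class. The leading underscore marks _name as internal,
    which is how encapsulation is expressed by convention in Python."""

    def __init__(self, name):
        self._name = name

    def area(self):
        raise NotImplementedError("subclasses must implement area()")

    def describe(self):
        # Calls whichever area() the concrete subclass provides (polymorphism).
        return f"{self._name} with area {self.area():.2f}"


class Rectangle(Shape):  # inheritance: Rectangle reuses Shape's describe()
    def __init__(self, width, height):
        super().__init__("rectangle")
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


class Circle(Shape):
    def __init__(self, radius):
        super().__init__("circle")
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


descriptions = [shape.describe() for shape in (Rectangle(2, 3), Circle(1))]
print(descriptions)
```

The describe method is written once in the base class and reused by both subclasses, which is the kind of code reuse the object-oriented approach is meant to provide.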
7) Extensible
- Code written in other languages such as C or C++ can be compiled and then used from Python code.
- Python converts a program into bytecode, and that bytecode can run on any platform that has a compatible Python interpreter.
8) Large Collection of Libraries
- Python provides a vast range of libraries for fields such as machine learning, web development, and scripting.
- Popular machine learning and data libraries include TensorFlow, Pandas, NumPy, Keras, and PyTorch.
- Django, Flask, and Pyramid are popular frameworks for Python web development.
9) GUI Programming Support
- Graphical user interfaces (GUIs) are used for developing desktop applications.
- PyQt5, Tkinter, and Kivy are libraries used for building graphical applications.
10) Integrated and Embeddable
- Python can be easily integrated with languages such as C, C++, and Java.
- Unlike compiled languages such as C and C++, Python runs code statement by statement, which makes the code easier to debug.
- Code written in other programming languages can be used in Python source code.
- Python source code can also be used in programs written in other languages.
11) Dynamic Typing and Memory Allocation
- In Python, we don't need to specify the data type of a variable.
- When we assign a value to a variable, memory is allocated automatically at run time.
- For example, to assign the integer value 15 to x, we don't need to write int x = 15; we just write x = 15.
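The x = 15 example can be run directly. The variable names first_type and second_type are added here only to record what Python reports at each step.

```python
x = 15                           # no declaration such as "int x = 15" is needed
first_type = type(x).__name__    # the type is taken from the assigned value

x = "fifteen"                    # the same name can later refer to another type
second_type = type(x).__name__

print(first_type, second_type)
```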
Advantages of Python:
- Ease of learning and use: Python's code looks like English, so it is simple to read and write. This makes it easy for beginners to start learning programming.
- Versatile: Python can be used for many different types of work, like creating websites, analyzing data, making games, automating tasks, and more.
- Large Number of Libraries and Frameworks: Python has many libraries and frameworks that simplify complex tasks such as data handling, machine learning, and data visualization.
- Strong Community Support: Python has a large group of users and developers around the world who share tutorials, documentation, and third-party tools.
Disadvantages of Python:
- Performance: Python runs slower than some other programming languages like C or C++ because it is an interpreted language, not a compiled one.
- Mobile App Development: Python is not used much for creating mobile apps. Other languages like Java or Kotlin are more common for mobile development.
- Runtime Errors: Python is dynamically typed, which means it checks variable types while running the program. This can lead to errors that only appear during execution (not before), which may affect stability.
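The runtime-error point can be shown with a small sketch. The add_tax function is hypothetical; it simply demonstrates that a type mismatch is reported only when the offending line actually runs.

```python
def add_tax(price, tax):
    """Return price plus tax; nothing checks the argument types in advance."""
    return price + tax


ok = add_tax(100, 8)  # works: both arguments are numbers

try:
    add_tax("100", 8)  # a string slipped in, for example from unparsed input
    error = None
except TypeError as exc:
    # The mismatch is only detected here, while the program is running.
    error = type(exc).__name__

print(ok, error)
```

Testing and optional type hints checked by an external tool are common ways to catch such mistakes earlier.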
Applications of Python:
Why Python for Data Analysis?
1. Ease of Learning and Use
- Python’s syntax is simple, clean, and readable, resembling plain English.
- It’s accessible to beginners and professionals alike, even those without a strong programming background.
- The learning curve is gentle, allowing analysts to focus on data insights rather than complex programming logic.
2. Rich Ecosystem of Data Analysis Libraries
Python offers a wide range of specialized libraries that streamline data analysis:
| Library | Purpose |
|---|---|
| Pandas | Data manipulation and analysis (e.g., data frames, filtering, aggregation) |
| NumPy | Efficient numerical computing and array operations |
| SciPy | Scientific and technical computing (e.g., linear algebra, optimization) |
| Matplotlib / Seaborn | Data visualization (charts, graphs, heatmaps) |
| Scikit-learn | Predictive modeling and analysis |
3. Flexibility and Interoperability
- Python can handle various data formats, including CSV, Excel, JSON, SQL databases, and APIs.
- It integrates well with other tools and languages such as SQL, R, Java, Hadoop, and Spark.
- Python can be used for both small-scale analysis (e.g., local datasets) and large-scale data processing.
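As a small illustration of working with more than one data format, the sketch below reads CSV and JSON text with the standard library and combines them. The product data and field names are hypothetical; Excel files and SQL databases would need other modules or third-party libraries.

```python
import csv
import io
import json

# Hypothetical inputs; real data would usually be read from files or an API.
CSV_TEXT = "sku,units\nA1,10\nB2,4\n"
JSON_TEXT = '[{"sku": "A1", "name": "Widget"}, {"sku": "B2", "name": "Gadget"}]'

csv_rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))  # CSV -> list of dicts
json_rows = json.loads(JSON_TEXT)                       # JSON -> list of dicts

# Look up product names by SKU, then attach them to the CSV rows.
name_by_sku = {row["sku"]: row["name"] for row in json_rows}
combined = [
    {"sku": row["sku"], "units": int(row["units"]), "name": name_by_sku[row["sku"]]}
    for row in csv_rows
]

print(combined)
```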
4. Strong Community and Support
- Python has a large, global community of developers, data analysts, and data scientists.
- Thousands of forums, tutorials, blogs, GitHub repositories, and Stack Overflow discussions exist for problem-solving and learning.
- The community actively contributes to open-source libraries, ensuring continuous improvement and innovation.
5. Seamless Integration with Machine Learning
- Python is a leading language for machine learning (ML) and artificial intelligence (AI).
- Popular ML libraries include:
  - Scikit-learn: for traditional ML algorithms like regression, classification, and clustering.
  - TensorFlow / PyTorch: for deep learning, neural networks, and complex models.
- Analysts can easily scale from basic descriptive analysis to predictive analytics and advanced ML models within the same ecosystem.
6. Open Source and Free to Use
- Python is completely open-source and free to download, use, and modify.
- This reduces the cost of entry, making it accessible to individuals, students, startups, and organizations.
- No expensive licenses or subscriptions are needed for core functionality.
7. Cross-Platform Compatibility
- Python is platform-independent: it works on Windows, macOS, Linux, and other operating systems without major changes in code.
- This makes it easier to develop, test, and deploy data analysis projects across different environments.
8. Industry Adoption
- Python has gained widespread adoption across industries, including technology, finance, healthcare, marketing, and research.
- Its versatility and ecosystem make it the go-to tool for data tasks in almost every industry.
What is a Library?
A library in programming is a collection of pre-written code modules. These modules contain a set of functions and routines that help in performing specific tasks or solving common problems.
Libraries make software development easier by providing reusable code components. This helps developers avoid writing code from scratch and saves both time and effort. Developers can integrate these libraries into different programs or projects easily.
Libraries can perform many tasks, such as:
- Mathematical operations
- File handling
- Network communication
- Data manipulation
- User interface design
- And many other programming tasks
By using libraries, developers can avoid repetitive coding and focus more on building the main features of their applications.
Libraries can be available in different forms:
- Source code form, which developers can compile and link
- Pre-compiled binary files, which can be directly used in a program
Python uses tools like pip to install libraries. These libraries are usually distributed as packages.
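To see what reusing library code looks like in practice, the sketch below compares a hand-written median with the one in Python's built-in statistics module. The function median_by_hand is written only for this comparison. The standard library needs no installation, while third-party packages are installed first with a command such as pip install numpy.

```python
import statistics


def median_by_hand(values):
    """What we would have to write and test ourselves without a library."""
    ordered = sorted(values)
    middle = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


data = [7, 3, 9, 1]

manual_result = median_by_hand(data)
library_result = statistics.median(data)  # one call to tested, reusable code

print(manual_result, library_result)
```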
Definition of a Software Library:
A software library is a collection of non-volatile resources used by computer programs to help in software development. These resources can include:
- Configuration data
- Documentation
- Help data
- Message templates
- Pre-written code and subroutines
- Classes
- Values
- Type specifications
Purpose of Software Libraries:
The main purpose of a software library is to provide a standard way to perform common programming tasks. These tasks may include:
- Input/output operations
- String manipulation
- Data storage
- Using complex algorithms
By reusing code from libraries, developers can reduce development time, improve the quality of their code, and add more features easily.
Advantages:
- Code Reusability: Libraries let developers reuse code, reducing duplication and fostering efficient development.
- Reliability: Libraries are usually tested and maintained by experienced developers or communities. This makes them more reliable and less prone to errors than custom-written code.
- Time-Saving: Libraries help speed up the development process by offering ready-made functions and classes. This saves developers from writing everything from scratch.
Disadvantages:
- Reliance on External Libraries: If a program depends too much on external libraries, it may face problems when a library becomes outdated or is no longer supported.
- Overhead: Including large libraries for small tasks can unnecessarily increase the size of an application and slow down its loading time.
- Learning Curve: Understanding and using certain libraries correctly can take extra effort and learning time, especially for beginners.
Applications:
Software libraries are widely used in different areas of software development. Some common applications are:
- Web Development: Libraries like jQuery make it easier to work with HTML documents, handle events, and perform tasks like AJAX interactions and DOM manipulation.
- Data Analysis: Libraries such as Pandas and NumPy offer powerful tools to handle, process, and analyze large amounts of data.
- Machine Learning: Libraries like TensorFlow and Scikit-learn provide tools for data mining, building machine learning models, and performing data analysis.
Essential Python Libraries
1. NumPy (Numerical Python)
- NumPy is a foundational library for numerical computing in Python.
- It provides powerful N-dimensional (1-D, 2-D, 3-D, and higher) array objects for efficient data storage and manipulation.
- It offers a wide range of mathematical functions (square, sqrt), linear algebra operations (dot, transpose), and random number generation.
Advantages:
- NumPy allows efficient numerical operations on large datasets.
- It supports N-dimensional array operations for scientific computing.
- It integrates well with other libraries for data analysis and machine learning.
Applications:
2. Pandas
- Built on top of NumPy, specifically designed for data analysis and manipulation.
- Introduces DataFrames, which are tabular data structures with labeled rows and columns, similar to spreadsheets.
- Enables easy data loading, cleaning, exploring, transforming, and merging.
Advantages:
- Tabular data manipulation with DataFrames.
- Easy handling of missing data and data alignment.
- Integration with databases and time series data.
Applications:
3. Matplotlib
- Cornerstone library for data visualization in Python.
- Capable of creating static, animated, and interactive visualizations (line plots, scatter plots, histograms, bar charts, heatmaps, etc.).
- Offers extensive customization options for fine-tuning all visualization elements.
Advantages:
- Creation of a wide variety of static and interactive plots.
- Fine-grained control over plot aesthetics.
- Seamless integration with Jupyter notebooks.
Applications:
- Data visualization for exploratory data analysis.
- Presentation-quality plots for reports and publications.
- Educational purposes in teaching and tutorials.
4. Seaborn
- Built on top of Matplotlib; provides a high-level interface for creating aesthetically pleasing statistical graphics.
- Simplifies the creation of common plots like heatmaps, violin plots, joint plots, and more.
Advantages:
- Simplifies the creation of complex statistical plots.
- Aesthetic enhancements for Matplotlib plots.
- Integration with Pandas for easy data manipulation.
Applications:
- Statistical data visualization.
- Exploration of complex datasets.
- Presenting the results of statistical analyses.
5. Scikit-learn
- Versatile machine learning library.
- Supports classification, regression, clustering, dimensionality reduction, model selection, and preprocessing.
- User-friendly API with consistent syntax.
Advantages:
- Consistent and simple API for various machine learning algorithms.
- Integration with other libraries like NumPy and Pandas.
- Robust tools for model selection and evaluation.
Applications:
- Classification, regression, and clustering tasks.
- Dimensionality reduction and feature extraction.
- Model selection and evaluation in machine learning pipelines.
6. Statsmodels
- Focused on statistical modeling and analysis.
- Provides tools for model fitting, hypothesis testing, and time series analysis.
- Works well with NumPy and Pandas.
Advantages:
- Emphasis on statistical models and hypothesis testing.
- Integration with Pandas for data handling.
- Comprehensive tools for linear and non-linear models.
Applications:
- Statistical analysis and hypothesis testing.
- Econometrics and financial modelling.
- Time series analysis and forecasting.
7. Requests
- Simplifies making HTTP requests and interacting with web APIs.
Advantages:
- Simplifies HTTP requests in Python.
- Versatile for various HTTP methods.
- Support for handling authentication and cookies.
Applications:
- Web scraping and data extraction.
- Interaction with web APIs to fetch data.
- Automated testing of web services.
8. Beautiful Soup
- Parses HTML and XML documents.
- Ideal for web scraping.
Advantages:
- HTML and XML parsing for web scraping.
- Navigating and searching parsed tree structures.
- Integration with other libraries for data extraction.
Applications:
- Web scraping and data mining.
- Extracting structured information from websites.
- Automating the parsing of HTML and XML documents.
9. Plotly
- Creates interactive, web-based visualizations.
- Supports zooming, panning, and hovering.
Advantages:
- Creation of interactive, web-based visualizations.
- Support for a wide range of chart types.
- Integration with Jupyter notebooks and online hosting.
Applications:
- Building interactive dashboards.
- Exploratory data analysis with interactive plots.
- Web-based data visualization applications.
10. TensorFlow and PyTorch
- Powerful libraries for deep learning and AI.
- Used to build neural networks and solve complex tasks like image recognition and NLP.
Advantages:
- Powerful frameworks for deep learning and neural networks.
- Support for GPU acceleration.
- Extensive community support and pre-trained models.
Applications:
- Image and speech recognition.
- Natural language processing.
- Training and deploying deep learning models for various tasks.
Python Language Basics
Keywords:
- Keywords are the reserved words in the Python programming language.
- Every keyword has a special meaning.
- The meaning of a keyword is fixed; it cannot be modified or removed.
- Keywords are case-sensitive and must be written exactly as defined: all are lowercase except True, False, and None.
- Recent versions of Python 3 (3.7 and later) have 35 keywords.
- All the keywords in Python are listed in the following table with their meaning:
| S.No | Keyword | Description |
|---|---|---|
| 1 | False | Boolean false value |
| 2 | None | Represents absence of value |
| 3 | True | Boolean true value |
| 4 | and | Logical AND operator |
| 5 | as | Alias (e.g. in import, with statements) |
| 6 | assert | Assert condition for debugging |
| 7 | async | Defines an async function |
| 8 | await | Awaits result of an async operation |
| 9 | break | Exit loop early |
| 10 | class | Defines a class structure |
| 11 | continue | Skip to next loop iteration |
| 12 | def | Define a function |
| 13 | del | Delete a variable or element |
| 14 | elif | Else-if in conditional chains |
| 15 | else | Fallback in conditionals or try blocks |
| 16 | except | Catch exceptions |
| 17 | finally | Always-run clause in try-except blocks |
| 18 | for | For-loop construct |
| 19 | from | Import specific items from modules |
| 20 | global | Declare a global variable |
| 21 | if | Conditional branching |
| 22 | import | Import a module |
| 23 | in | Membership test |
| 24 | is | Identity comparison |
| 25 | lambda | Define anonymous (single-expression) function |
| 26 | nonlocal | Declare non-local (enclosing scope) variable |
| 27 | not | Logical NOT operator |
| 28 | or | Logical OR operator |
| 29 | pass | No-op placeholder |
| 30 | raise | Raise an exception |
| 31 | return | Return from a function |
| 32 | try | Start a try-except block |
| 33 | while | While-loop construct |
| 34 | with | Context manager usage |
| 35 | yield | Generate values from a generator |
You can use the built-in keyword module in Python to display all the current keywords.

    import keyword

    # Print every reserved keyword, one per line.
    for kw in keyword.kwlist:
        print(kw)
Identifiers
- Identifiers in Python are names used to identify a variable, function, class, module, or other objects.
- An identifier can only contain letters, digits, and underscores, and cannot start with a digit.
- In Python, identifiers are case-sensitive, meaning that swathi and SWATHI are considered to be two different identifiers.
Rules for Identifiers in Python:
These are the rules for identifiers in Python:
- Keywords cannot be used as identifiers in Python (because they are reserved words).
- The names of identifiers in Python cannot begin with a digit.
- Identifiers should be unique within the same scope; assigning to an existing name rebinds it to the new value.
- The first character of an identifier must be a letter or an underscore; it can be followed by any combination of letters, digits, and underscores.
- Identifier name length is unrestricted.
- Names of identifiers in Python are case-sensitive, meaning ‘car’ and ‘Car’ are two different identifiers.
- Special characters such as ‘%’, ‘#’, ‘@’, and ‘$’ are not allowed in identifiers in Python.
Valid Identifiers:
These are examples of valid identifiers in Python.
- yourname - It contains only lowercase letters.
- Name_school - It contains only ‘_’ as a special character.
- Id1 - Here, the numeric digit comes at the end.
- roll_2 - It starts with a lowercase letter and ends with a digit.
- _classname - It contains lowercase letters and an underscore, and it starts with an underscore ‘_’.
Invalid Identifiers:
These are examples of invalid identifiers in Python.
- for, while, in - These are keywords in Python, so they cannot be used as identifiers.
- 1myname - Invalid identifier because it begins with a digit.
- $myname - Invalid identifier because it starts with a special character.
- a b - Invalid identifier because it contains a blank space.
- a/b and a+b - Invalid identifiers because they contain special characters.
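The identifier rules can also be checked programmatically. The following is a minimal sketch that combines the built-in str.isidentifier() method with keyword.iskeyword() from the standard library; the helper name is_valid_identifier is chosen here purely for illustration.

```python
import keyword


def is_valid_identifier(name):
    """Return True if name can be used as a variable, function, or class name."""
    # isidentifier() checks the character rules; iskeyword() rejects reserved words.
    return name.isidentifier() and not keyword.iskeyword(name)


for candidate in ["yourname", "_classname", "roll_2", "1myname", "$myname", "a b", "for"]:
    print(candidate, "->", is_valid_identifier(candidate))
```

Both checks are needed because a keyword such as for satisfies the character rules on its own.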
Variable:
- In Python, a variable is a name that refers to a value stored in memory, so a programmer can store data and retrieve it later using the same name.
- In Python, variables are created without specifying any data type.
- There is no specific keyword used to create a variable.
- Variables are created directly by specifying the variable name with a value.
We use the following syntax to create a variable:
Syntax: variable_name = value
When a variable is defined, we must create it with a value.
```python
roll_number = 101
print(f'Student roll number is {roll_number}')
```
Output:
```
Student roll number is 101
```
Declaring multiple variables in a single statement
- In Python, it is possible to define more than one variable using a single statement.
- When multiple variables are created using a single statement, the variables and their corresponding values must be separated with commas.
Python code to illustrate variable declaration:
```python
name, roll_number = ('Saranya', 101)
# Double quotes outside so the apostrophe in 's does not end the string
print(f"{name}'s roll number is {roll_number}")
```
Output:
```
Saranya's roll number is 101
```
Assigning a single value to multiple variables:
```python
x = y = z = 50
print(x)
print(y)
print(z)
```
Output:
```
50
50
50
```
Displaying the data type of a variable
- In Python, a variable is never fixed to a particular data type; its type is determined by the value currently assigned to it.
- A variable in Python can store a value of any data type.
- It can change its data type dynamically when a new value is assigned.
- The Python programming language provides a built-in function type() to display the data type of a variable.
Let's consider the following Python code:
```python
a = 105
print(type(a))
a = 10.66
print(type(a))
a = 'Ashaz'
print(type(a))
```
Output:
```
<class 'int'>
<class 'float'>
<class 'str'>
```
Comments
- Comments are essential for describing the code and help us and others to understand it.
- By looking at the comments, we can easily understand the intention of every line that we have written in the code.
- Well-commented code also makes it easier to find errors, fix them, and reuse the code in other applications.
- In Python, we write comments using the # (hash) character.
- The Python interpreter ignores everything from the hash character to the end of the line.
- A good programmer uses comments to make code understandable.
Types of Comments in Python
1. Single-Line Comments
- Begin with the # symbol.
- Used for brief explanations or notes.
- Applies only to one line.
Syntax:
```python
# This is a single-line comment
print("Sri")
```
Output:
```
Sri
```
2. Multi-Line (Block) Comments
- Multiple single-line comments used together to create block comments.
- Python doesn't support block comments like /* */, so # is used at the beginning of each line.
Syntax:
```python
# This type of comment can serve
# both as a single line as well
# as a multi-line (block) comment
```
Example:
```python
# Read name from keyboard
# variable name is myName
myName = input("Enter your Name: ")
# Display data on the output screen
print("Hello,", myName)
```
Output:
```
Enter your Name: Swathi
Hello, Swathi
```
3. Inline Style Comments
- Placed on the same line as a statement.
- Used to explain the statement.
Example:
```python
# Find the product of two numbers
x = 5  # value 5 stored in x
y = 7  # value 7 stored in y
z = x * y  # product stored in z
print("Product is:", z)
```
Output:
```
Product is: 35
```
4. Docstring Comments (Documentation Strings)
- Written using triple quotes ''' or """.
- Used to describe the purpose of a function, class, or module.
- Not exactly comments: a triple-quoted string that is not assigned to a variable is evaluated and discarded, so it acts like a comment. When it is the first statement in a module, function, or class, Python stores it as that object's docstring.
Example:
```python
"""
Read two values through command line arguments
Then find the sum with data type conversion
"""
import sys

x = int(sys.argv[1])  # Read and convert x
y = int(sys.argv[2])  # Read and convert y
total = x + y  # named total to avoid shadowing the built-in sum()
print("Sum of two numbers is:", total)
```
Command Line Output:
```
> python docstring.py 15 6
Sum of two numbers is: 21
```
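A docstring differs from a # comment in that Python keeps it at run time. As a small sketch (the function rectangle_area is a made-up example), a string placed as the first statement of a function is stored in its __doc__ attribute, while # comments are discarded:

```python
def rectangle_area(width, height):
    """Return the area of a rectangle with the given width and height."""
    # This comment is ignored by the interpreter and is not stored anywhere.
    return width * height


print(rectangle_area(4, 5))    # 20
print(rectangle_area.__doc__)  # prints the docstring text
```

The same text is what help(rectangle_area) displays, which is why docstrings are preferred for documenting functions, classes, and modules.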
Datatypes
In Python, everything is an object, and every object has a data type. A data type defines the type of value a variable can hold and the operations that can be performed on it.
Classification of Python Built-in Data Types:
➤ 1. Numeric Data Types
These represent numbers and are of three types:
a) int
- Used to store whole numbers.
- Can be positive or negative.
- There is no fixed limit to the size of an integer; it is bounded only by available memory.
Example:
```python
a = 291
print(type(a))  # <class 'int'>
```
b) float
- Used to store decimal (floating point) numbers.
- It can also store values in scientific notation using e or E.
Example:
```python
b = 267.0
print(type(b))  # <class 'float'>
```
c) complex
- Used to represent complex numbers.
- Syntax: real + imaginary j
Example:
```python
c = 231 + 27j
print(type(c))  # <class 'complex'>
```
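To make the three numeric types more concrete, here is a brief sketch with illustrative values (none of them come from the original examples). It shows an integer larger than 64 bits, a float written in scientific notation, and the parts of a complex number:

```python
big = 2 ** 100            # int: not limited to 32 or 64 bits
print(big)                # 1267650600228229401496703205376

small = 1.5e-3            # float in scientific notation (1.5 x 10^-3)
print(small)              # 0.0015

c = 3 + 4j                # complex number
print(c.real, c.imag)     # 3.0 4.0
print(abs(c))             # 5.0 (the magnitude)
```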
➤ 2. Sequence Data Types
Sequences store a collection of items in a specific order.
a) str (String)
- A string is a sequence of Unicode characters.
- Defined using single, double, or triple quotes.
- No separate character data type in Python. A character is a string of length 1.
Example:
```python
s = "Swathi"
print(type(s))  # <class 'str'>
```
b) list
- Ordered, mutable (can change after creation).
- Allows duplicate values.
- Can hold elements of different types.
Creating a list:
```python
lst = ["Hello", "Swathi", 71, 2025, 6.0, 'K']
```
Accessing list elements:
```python
print(lst[0])        # 'Hello'
print(lst[-1])       # 'K' (last item)
print(type(lst[2]))  # <class 'int'>
```
c) tuple
- Ordered, immutable (cannot change once created).
- Can contain elements of different data types.
Creating a tuple:
```python
tpl = ('Hello', 'Swathi')
tpl2 = tuple([60.5, 27, 2025, 11, "Earth", "K"])
```
Nested tuple:
```python
tpl3 = (tpl2, tpl)
```
Accessing elements:
```python
print(tpl3[0][2])  # 2025
```
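The practical difference between mutable and immutable sequences is easiest to see by trying to change them. The sketch below uses made-up values: the list accepts in-place changes, while the tuple raises a TypeError on item assignment.

```python
marks = [71, 85, 90]      # list: mutable
marks[0] = 75             # replace an element in place
marks.append(60)          # add a new element at the end
print(marks)              # [75, 85, 90, 60]

point = (10, 20)          # tuple: immutable
try:
    point[0] = 99         # item assignment is not allowed on a tuple
except TypeError as err:
    print("TypeError:", err)

name = "Swathi"           # str is also an immutable sequence
print(name[0], name[-1], len(name))   # S i 6
```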
➤ 3. Boolean Type
- Has only two values: True and False
- Used in logical operations, conditions, and comparisons.
- Internally, True is treated as 1 and False as 0.
Example:
```python
print(type(True))   # <class 'bool'>
print(type(False))  # <class 'bool'>
```
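Because True and False behave like 1 and 0, they can take part in arithmetic. As a short illustration (the scores list and the pass mark of 50 are assumptions made for this example), summing comparison results counts how many of them are True:

```python
print(True + True)             # 2
print(isinstance(True, int))   # True: bool is a subclass of int

scores = [45, 72, 88, 30]
passed = sum(score >= 50 for score in scores)  # each True counts as 1
print(passed)                  # 2
```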
➤ 4. Set
- Unordered collection of unique items.
- Cannot have duplicates.
- Mutable – we can add or remove elements.
- Does not support indexing.
Creating a set:
```python
set1 = set()
set2 = {68.0, 27, 2025, 'K', 'Swathi', 'S'}
```
Accessing elements:
```python
for i in set2:
    print(i)  # Iterates through set elements
```
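The properties listed for sets can be sketched as follows, using illustrative values. Note that an empty set is written set(), because {} creates an empty dictionary. The results are passed through sorted() only to get a predictable display order, since a set itself has none.

```python
numbers = {3, 1, 2, 3, 1}       # duplicates are dropped on creation
print(len(numbers))             # 3

numbers.add(10)                 # add a new element
numbers.discard(1)              # remove an element (no error if it is missing)
print(sorted(numbers))          # [2, 3, 10]

evens = {2, 4, 6}
print(sorted(numbers & evens))  # [2] (intersection)
print(sorted(numbers | evens))  # [2, 3, 4, 6, 10] (union)
```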
➤ 5. Dictionary
- Stores key-value pairs.
- Mutable, and does not allow duplicate keys. Since Python 3.7, dictionaries preserve the insertion order of their keys.
- Keys must be immutable (hashable) objects such as numbers, strings, or tuples.
Creating a dictionary:
```python
dic = {"name": "Chinnu", "age": 2, 5: "Swathi"}
```
Accessing values:
```python
print(dic['name'])     # Chinnu
print(dic[5])          # Swathi
print(dic.get('age'))  # 2
```
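Beyond reading values, a dictionary is typically updated and iterated. The following sketch uses hypothetical keys and values ("grade" and "email" are not part of the original example) and also shows what happens when a mutable object is used as a key:

```python
student = {"name": "Chinnu", "age": 2}
student["grade"] = "A"          # add a new key-value pair
student["age"] = 3              # assigning to an existing key replaces its value

# get() returns a default instead of raising KeyError for a missing key
print(student.get("email", "not available"))

for key, value in student.items():
    print(key, "=", value)

try:
    bad = {[1, 2]: "list as key"}   # lists are mutable, so they cannot be keys
except TypeError as err:
    print("TypeError:", err)
```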
Operators
- In Python, an operator is a symbol used to perform arithmetical and logical operations.
- In other words, an operator can be defined as a symbol used to manipulate the value of an operand.
- Here, an operand is a value or variable on which the operator performs its task.
- For example, '+' is a symbol used to perform the mathematical addition operation. Consider the expression a = 10 + 30.
- Here, variable 'a', values '10' and '30' are known as Operands, and the symbols '=' and '+' are known as Operators.
Types of Operators in Python
- Arithmetic Operators ( +, -, *, /, %, **, // )
- Assignment Operators ( =, +=, -=, *=, /=, %=, **=, //= )
- Comparison Operators ( <, <=, >, >=, ==, != )
- Logical Operators ( and, or, not )
- Identity Operators ( is, is not )
- Membership Operators ( in, not in )
Arithmetic Operators
- In Python, the arithmetic operators are used to perform basic arithmetic operations between two variables or two values.
- The following table presents the list of arithmetic operations in Python along with their description.
- To understand the examples, let's consider two variables, a with value 10 and b with value 3.
| Operator | Meaning | Description | Example |
|---|---|---|---|
| + | Addition | Adds the values on both sides of the operator | a + b = 13 |
| - | Subtraction | Subtracts the right-hand operand from the left-hand operand | a - b = 7 |
| * | Multiplication | Multiplies the values on both sides of the operator | a * b = 30 |
| / | Division | Divides the left-hand operand by the right-hand operand; the result is always a float | a / b ≈ 3.33 |
| % | Modulus | Returns the remainder of the division of the left operand by the right operand | a % b = 1 |
| ** | Exponentiation | Raises the left operand to the power of the right operand | a ** b = 1000 |
| // | Floor Division | Divides and returns the largest whole number less than or equal to the result | a // b = 3 |
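The arithmetic operators can be tried directly with the same values, a = 10 and b = 3. The last two lines are an addition to the table: they sketch how floor division and modulus behave with a negative operand, which often surprises beginners.

```python
a, b = 10, 3

print(a + b)     # 13
print(a / b)     # true division, always returns a float
print(a // b)    # 3
print(a % b)     # 1
print(a ** b)    # 1000

print(-a // b)   # -4: floor division rounds toward negative infinity
print(-a % b)    # 2: the result takes the sign of the right operand
```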
Assignment Operators
- In Python, the assignment operators are used to assign the right-hand side value to the left-hand side variable.
- The following table presents the list of assignment operations in Python along with their description.
| Operator | Meaning | Description | Example |
|---|---|---|---|
| = | Assignment | Assigns the value on the right to the variable on the left | x = 5 |
| += | Add and Assign | Adds the right operand to the left operand and assigns the result to the left operand | x += 3 → x = x + 3 |
| -= | Subtract and Assign | Subtracts the right operand from the left operand and assigns the result to the left operand | x -= 2 → x = x - 2 |
| *= | Multiply and Assign | Multiplies the left operand by the right and assigns the result to the left operand | x *= 4 → x = x * 4 |
| /= | Divide and Assign | Divides the left operand by the right and assigns the result to the left operand | x /= 2 → x = x / 2 |
| %= | Modulus and Assign | Takes modulus using left and right operands, assigns the result to the left operand | x %= 3 → x = x % 3 |
| **= | Exponent and Assign | Raises the left operand to the power of the right operand and assigns it to the left operand | x **= 2 → x = x ** 2 |
| //= | Floor Divide and Assign | Performs floor division and assigns the result to the left operand | x //= 3 → x = x // 3 |
Comparison Operators
- In Python, the comparison operators are used to compare two values.
- In other words, comparison operators are used to check the relationship between two variables or values.
- The comparison operators are also known as Relational Operators.
- The results in the following table use a with value 10 and b with value 3.
| Operator | Meaning | Description | Example | Result |
|---|---|---|---|---|
| < | Less than | Returns True if the left value is smaller than the right value, else False | a < b | False |
| <= | Less than or Equal to | Returns True if the left value is smaller than or equal to the right value, else False | a <= b | False |
| > | Greater than | Returns True if the left value is larger than the right value, else False | a > b | True |
| >= | Greater than or Equal to | Returns True if the left value is larger than or equal to the right value, else False | a >= b | True |
| == | Equal to | Returns True if the left value is equal to the right value, else False | a == b | False |
| != | Not equal to | Returns True if the left value is not equal to the right value, else False | a != b | True |
Logical Operators
- In Python, the logical operators are used to combine multiple conditions into a single condition.
| Operator | Meaning | Description |
|---|---|---|
| and | Logical AND | Returns True if both conditions are True. Otherwise, False. |
| or | Logical OR | Returns True if at least one condition is True. |
| not | Logical NOT | Reverses the logical state of the condition. |
Identity Operators
- In Python, identity operators are used to compare the memory locations of two objects or variables.
- The following table presents the list of identity operations in Python along with their description.
| Operator | Meaning | Description |
|---|---|---|
| is | Is identical | Returns True if both variables point to the same object in memory. |
| is not | Is not identical | Returns True if both variables do not point to the same object in memory. |
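Identity is easily confused with equality. The sketch below, with illustrative variable names, contrasts == (same value) with is (same object): two lists with equal contents are still separate objects, whereas a second name bound to the same list is identical to it.

```python
first = [1, 2, 3]
second = [1, 2, 3]        # equal contents, but a separate object
alias = first             # a second name for the same object

print(first == second)    # True: the values are equal
print(first is second)    # False: two different objects
print(first is alias)     # True: both names refer to one object

alias.append(4)           # changing the object through one name...
print(first)              # [1, 2, 3, 4] ...is visible through the other

value = None
print(value is None)      # True: the usual way to test for None
```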
Membership Operators
- In Python, the membership operators are used to test whether a value is present in a sequence. Here the sequence may be a string, list, or tuple.
- The following table presents the list of membership operations in Python along with their description.
| Operator | Meaning | Description |
|---|---|---|
| in | In | Returns True if the value is found in the given sequence. |
| not in | Not in | Returns True if the value is not found in the sequence. |
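A few membership tests, sketched with values borrowed from the earlier examples. The dictionary lines go slightly beyond the sequences named above: with a dictionary, in checks the keys rather than the values.

```python
print("ath" in "Swathi")                  # True: substring test on a string
print(71 in [71, 2025, 6.0])              # True
print("K" not in ("Hello", "Swathi"))     # True

dic = {"name": "Chinnu", "age": 2}
print("name" in dic)                      # True: membership tests the keys
print("Chinnu" in dic)                    # False: values are not searched
print("Chinnu" in dic.values())           # True
```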
Input and Output Functions
The input() function in Python is used to take input from the user during program execution. It always returns the input as a string.
The print() function in Python is used to display output to the console. It can print strings, numbers, variables, and even formatted text. It accepts two optional parameters:
- sep: defines the separator between printed items (default is a space ' ').
- end: defines what to print at the end (default is a newline \n).
An f-string is a string literal prefixed with f or F. It allows you to embed expressions or variables directly inside the string using curly braces {}.
The format() function is a string method that inserts values into placeholders {} within a string.
Type Conversion
Type conversion is the process of converting a value from one data type to another, such as from int to float, or from str to int.
Implicit Type Conversion
- The Python interpreter automatically converts one data type to another without any user involvement.
- It happens when you mix different types in an expression.
- Python converts narrower numeric types to wider ones (for example, int to float) to avoid data loss.
Explicit Type Conversion
- It is done manually by the programmer as per requirement.
- You use built-in functions to convert between types.
| Function | Converts to | Example |
|---|---|---|
| int(x) | Integer | int("5") → 5 |
| float(x) | Floating-point | float("5") → 5.0 |
| str(x) | String | str(5) → "5" |
| bool(x) | Boolean | bool(0) → False |
| list(x) | List | list("abc") → ['a','b','c'] |
| tuple(x) | Tuple | tuple([1,2]) → (1, 2) |
| set(x) | Set | set([1,2,2]) → {1,2} |
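Since input() always returns a string, explicit conversion is usually the first step after reading a number. The sketch below ties the conversion functions to the output features described in this section. To keep it runnable without a keyboard, the variable raw_age is an assumed stand-in for the string that input() would return.

```python
raw_age = "21"                 # stands in for: raw_age = input("Enter age: ")
age = int(raw_age)             # explicit conversion: str -> int
next_year = age + 1

print(f"Next year you will be {next_year}")            # f-string
print("Next year you will be {}".format(next_year))    # str.format()

total = 5 + 2.5                # implicit conversion: int is promoted to float
print(total, type(total))      # 7.5 <class 'float'>

print("2025", "06", "15", sep="-")   # 2025-06-15
print("Loading", end="...")          # no newline after this call
print("done")                        # Loading...done

try:
    int("5.5")                 # not a valid integer string
except ValueError as err:
    print("ValueError:", err)
```

Converting "5.5" needs float() first; int() only accepts strings that look like whole numbers.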
Flow of Control
By default, Python executes code line by line from top to bottom. However, we can change this flow using:
- Conditional Statements
- Looping Statements
- Loop Control Statements
Conditional Statements
| Statement | Description |
|---|---|
| if | Executes a block if the condition is true |
| elif | Else if: checks another condition when the previous ones are false |
| else | Executes if all above are false |
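The three statements in the table are usually combined into a single chain. A minimal sketch follows; the function name grade and the mark thresholds are assumptions chosen for illustration. Python checks the conditions from top to bottom and runs only the first branch whose condition is true.

```python
def grade(marks):
    """Map a mark out of 100 to a grade using an if / elif / else chain."""
    if marks >= 90:
        return "A"
    elif marks >= 75:      # checked only if the first condition was False
        return "B"
    elif marks >= 50:
        return "C"
    else:                  # runs when none of the conditions above are true
        return "Fail"


for marks in (95, 80, 50, 20):
    print(marks, "->", grade(marks))
```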
if statements
The if statement executes a block of code only if the condition is true.
elif statements
elif stands for "else if". It checks another condition if the previous if condition was False.
else statements
The else block executes if none of the previous conditions are true.
Looping Statements
Loops are used to execute a block of code repeatedly, either once for each item in a sequence or as long as a specified condition is true. Instead of writing the same code multiple times, you use loops to automate repetition.
for loop
A for loop in Python is used to iterate over a sequence such as a list, tuple, string, or a range of numbers.
It repeats a block of code for each item in the sequence.
Syntax:
```python
for variable in sequence:
    # Code block (executed for each item)
```
- variable: A temporary name that holds the current item from the sequence.
- sequence: A collection (like a list or range()) that the loop will go through.
while loop
The while loop repeats a block of code as long as a given condition is true. Unlike the for loop (which iterates over a sequence), the while loop is used when the number of iterations is not known in advance.
- The condition is evaluated before each iteration.
- The loop runs as long as the condition is True.
- Make sure to change a variable inside the loop to avoid an infinite loop!
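Both kinds of loop can be sketched in a few lines. The names and values below are illustrative only. The for loops walk over a list and a range(), while the while loop counts down and updates its own variable so that the condition eventually becomes False.

```python
# for loop over a list
for name in ["Swathi", "Saranya", "Ashaz"]:
    print("Hello,", name)

# for loop over a range of numbers (1 to 5)
squares = []
for n in range(1, 6):
    squares.append(n * n)
print(squares)            # [1, 4, 9, 16, 25]

# while loop: runs as long as the condition is True
count = 3
while count > 0:
    print("Countdown:", count)
    count -= 1            # without this update the loop would never end
print("Done")
```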
Loop Control Statements
Jumping statements are used to control the flow of loops. They allow you to:
- Exit a loop early
- Skip an iteration
- Do nothing but satisfy syntax
| Statement | Description |
|---|---|
| break | Exits the loop immediately |
| continue | Skips current iteration, moves to next |
| pass | Does nothing, used as a placeholder |
break statement
The break statement exits a loop (for and while) immediately when a condition is met.
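The three jump statements can be compared side by side. This is a small sketch with made-up numbers: break stops at the first value greater than 8, continue skips the even numbers, and pass fills a loop body that is intentionally empty.

```python
# break: stop at the first number greater than 8
found = None
for n in [4, 7, 10, 13]:
    if n > 8:
        found = n
        break             # 13 is never examined
print(found)              # 10

# continue: skip even numbers, keep the odd ones
odds = []
for n in range(1, 8):
    if n % 2 == 0:
        continue          # jump straight to the next iteration
    odds.append(n)
print(odds)               # [1, 3, 5, 7]

# pass: a placeholder where a statement is required but nothing should happen
for n in range(3):
    pass
```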

