Compare revisions

Changes are shown as if the source revision was being merged into the target revision. Learn more about comparing revisions.

Commits on Source (47)
Showing
with 2803 additions and 181 deletions
......@@ -131,3 +131,7 @@ dmypy.json
# project-specific files
README_init.md
# solution files and .py files in project directory
**/*_sol.py
/*.py
......@@ -3,10 +3,9 @@
"python.testing.unittestArgs": [
"-v",
"-s",
"./C_expressions_py",
"./D_functions_and_streams",
".",
"-p",
"*_test.py"
"test_*.py"
],
"python.testing.pytestEnabled": false,
"python.testing.unittestEnabled": true
......
# Assignment A: Setup Python &nbsp; (<span style="color:red">10 Pts</span>)
# Assignment A: Setup Python &nbsp; (10 Pts)
This assignment will set up your base Python environment. If you already have it, simply run challenges and answer questions (if any). If you cannot run challenges, set up the needed software.
### Challenges
1. [Challenge 1:](#1-challenge-1-terminal) Terminal
2. [Challenge 2:](#2-challenge-2-python3) Python3
3. [Challenge 3:](#3-challenge-3-pip) pip
4. [Challenge 4:](#4-challenge-4-test-python) Test Python
- [Challenge 1:](#1-challenge-terminal) Terminal
- [Challenge 2:](#2-challenge-python3) Python3
- [Challenge 3:](#3-challenge-pip) pip
- [Challenge 4:](#4-challenge-test-python) Test Python
Points: [4, 2, 2, 2]
&nbsp;
### 1.) Challenge 1: Terminal
### 1.) Challenge: Terminal
If you are using *MacOS* or *Linux*, skip steps for *Windows*.
For *Windows*,
For *Windows*:
- Use a Unix emulator such as
[cygwin](https://www.cygwin.com),
......@@ -24,44 +26,60 @@ the built-in Windows Subsystem for Linux
or a
[Linux VM](https://ubuntu.com/tutorials/how-to-run-ubuntu-desktop-on-a-virtual-machine-using-virtualbox).
- Windows *CMD.EXE* is not an option. *Powershell* is not recommended since it is not compatible with Unix standards.
- Windows *CMD.EXE* is not an option and *Powershell* is not recommended since it is not compatible with Unix standards.
- Follow
[instructions](../Cygwin_setup.md)
for setting up *cygwin*.
- Follow instructions for
[setting up *cygwin*](../Cygwin_setup.md),
in particular for:
- switching from `/cygdrive/c` to `/c`
- selecting your `HOME`-directory
- configuring `.bashrc` in your `HOME`-directory and
- defining `PATH` in `.bashrc`.
Open a terminal and type commands:
```sh
> ls -la
> pwd
> whoami
> cat ~/.profile
> cat ~/.bashrc
> echo $PATH
> cd # change to HOME-directory
> pwd # print path to working directory
> ls -la # show content of HOME-directory
> whoami # show user identity
> cat .profile # file may not exist
> cat .bashrc # for Mac, use .zshrc
> echo $PATH # show PATH variable
```
Explain commands. If you are not familiar, find out about these basic Unix shell (bash, zsh, ...) commands (e.g. from [introduction](https://cs.lmu.edu/~ray/notes/bash) or
[tutorial](https://linuxconfig.org/bash-scripting-tutorial-for-beginners)).
On *Mac*, refer to file *.zshrc* instead of *.bashrc*.
On *Mac*, use file *.zshrc* instead of *.bashrc*. If file does not exist
in the `HOME`-directory, it can be created as a text file using an editor.
Find out about
[ differences between](https://www.baeldung.com/linux/bashrc-vs-bash-profile-vs-profile)
`.profile` and `.bashrc` (Mac: `.zshrc`):
- When is `.profile` executed?
- When is `.bashrc` (Mac: `.zshrc`) executed?
(4 Pts)
&nbsp;
### 2.) Challenge 2: Python3
### 2.) Challenge: Python3
Check if you have Python 3 installed on your system. Name three differences between [Python 2 and 3](https://www.guru99.com/python-2-vs-python-3.html#7).
Run commands in terminal (exact version 3.x.x may vary):
Run commands in terminal (version 3.x.y may vary):
```sh
> python --version
Python 3.12.0
```
If you see error *command not found*, add path to Python installation to
PATH variable in `.bashrc` (Mac: `.zshrc`).
(2 Pts)
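
The same check can be made from inside Python. The snippet below is a minimal sketch using only the standard library; it reports the version of the running interpreter and whether a `python` command can be found on `PATH`. The variable names are chosen for illustration.

```python
import shutil
import sys

# Version of the interpreter that is running this snippet.
major, minor, micro = sys.version_info[:3]
print(f"Python {major}.{minor}.{micro}")

# Full path of the running interpreter.
print(f"interpreter: {sys.executable}")

# First `python` command found on PATH, or None if the shell
# would report "command not found".
print(f"python on PATH: {shutil.which('python')}")
```

If the last line prints `None`, the directory of the Python installation is likely missing from `PATH`.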
&nbsp;
### 3.) Challenge 3: pip
Check if you have a Python package manager installed (pip, conda, ... ). [`pip`](https://pip.pypa.io) is Python's default package manager needed to install additional python packages and libraries.
### 3.) Challenge: pip
Check if you have a Python package manager installed (pip, conda, ... ). [`pip`](https://pip.pypa.io) is Python's default package manager to install additional Python packages and libraries.
Follow [instructions](https://pip.pypa.io/en/stable/installing) for installation:
- [download](https://bootstrap.pypa.io/get-pip.py) the `get-pip.py` file.
......@@ -69,7 +87,7 @@ Follow [instructions](https://pip.pypa.io/en/stable/installing) for installation
- or update pip to latest version: `python -m pip install --upgrade pip`
Run commands in terminal:
```sh
```
> pip --version
pip 23.2.1 from C:\Users\svgr2\AppData\Local\Programs\Python\Python312\Lib\site-
packages\pip (python 3.12)
......@@ -78,8 +96,8 @@ packages\pip (python 3.12)
&nbsp;
### 4.) Challenge 4: Test Python
Test Python:
### 4.) Challenge: Test Python
Start Python in the terminal and execute commands:
```py
> python
Python 3.12.0 (tags/v3.12.0:0fb18b0, Oct 2 2023, 13:03:39) [MSC v.1935 64 bit (AMD64)] on win32
......@@ -115,7 +133,7 @@ print('Python system: ' + sys)
print('Python version: ' + platform.python_version())
```
Run the file. Output varies depending on your system.
Run file *print_sys.py*. Output varies depending on your system.
```
> python print_sys.py
Python impl: CPython
......
# Assignment B: Explore Python &nbsp; (<span style="color:red">XX Pts</span>)
# Assignment B: Explore Python &nbsp; (20 Pts)
This assignment demonstrates Python's basic data structures.
### Challenges
1. [Challenge 1:](#1-challenge-1-indexing-fruits) Indexing Fruits
2. [Challenge 2:](#2-challenge-2-packaging-fruits) Packaging Fruits
3. [Challenge 3:](#3-challenge-3-sorting-fruits) Sorting Fruits
4. [Challenge 4:](#4-challenge-4-income-analysis) Income Analysis
5. [Challenge 5:](#5-challenge-5-code-income-analysis) Code Income Analysis
6. [Challenge 6:](#6-challenge-6-python-built-in-functions)
Python built-in functions
- [Challenge 1:](#1-challenge-indexing-fruits) Indexing Fruits
- [Challenge 2:](#2-challenge-packaging-fruits) Packaging Fruits
- [Challenge 3:](#3-challenge-sorting-fruits) Sorting Fruits
- [Challenge 4:](#4-challenge-income-analysis) Income Analysis
- [Challenge 5:](#5-challenge-code-income-analysis) Code Income Analysis
- [Challenge 6:](#6-challenge-explore-python-built-in-functions)
Explore Python built-in functions
Points: [1, 7, 2, 3, 6, 1]
&nbsp;
### 1.) Challenge 1: Indexing Fruits
Explore Python. Review Python's basic
### 1.) Challenge: Indexing Fruits
Review Python's basic
[data structures](https://www.dataquest.io/blog/data-structures-in-python).
Answer questions on a piece of paper.
```py
# Python is known for advanced list processing.
>>> fruits = ['apple', 'pear', 'orange', 'banana']
>>> fruits = ['apple', 'pear', 'orange', 'banana', 'apple']
>>> print(fruits)
>>> fruits
['apple', 'pear', 'orange', 'banana']
['apple', 'pear', 'orange', 'banana', 'apple']
>>> len(fruits)
4
>>> print(len(fruits))
5
>>> print(f"the third fruit is: {fruits[2]}")
the third fruit is: orange
......@@ -35,36 +36,45 @@ the third fruit is: orange
the second and third fruits are: ['pear', 'orange']
>>> print(f"the last fruit is: {fruits[-1]}")
the last fruit is: banana
the last fruit is: apple
>>> print(f"the second-last fruit is: {fruits[-2]}")
the second-last fruit is: banana
>>> print(f"the last two fruits are: {fruits[-2:]}")
the last two fruits are: ['orange', 'banana']
>>> print(f"the last three fruits are: {fruits[-3:]}")
the last three fruits are: ['orange', 'banana', 'apple']
```
Perform examples on your laptop.
(1 Pt)
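
The indexing rules used above can be summarized in a short sketch. The list `colors` is a hypothetical example and not part of the assignment.

```python
colors = ['red', 'green', 'blue', 'yellow', 'red']  # hypothetical example list

third = colors[2]            # index counts from 0
last = colors[-1]            # negative index counts from the end
middle = colors[1:3]         # slice: start included, stop excluded
last_two = colors[-2:]       # open-ended slice runs to the end
every_second = colors[::2]   # third slice value is the step
beyond = colors[3:10]        # a slice tolerates an out-of-range stop

print(third, last, middle, last_two, every_second, beyond)
```

Unlike a slice, a plain index such as `colors[10]` raises an `IndexError`.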
&nbsp;
### 2.) Challenge 2: Packaging Fruits
### 2.) Challenge: Packaging Fruits
Review Python's built-in
[data structures](https://www.dataquest.io/blog/data-structures-in-python)
and answer questions on a piece of paper.
[data structures](https://www.dataquest.io/blog/data-structures-in-python).
Perform examples and answer questions on a piece of paper.
1. What are data types for `fruits`, `fruitbag` and `fruitbox` called? (1 Pt)
1. Name three properties that characterize each data type. (1 Pts)
1. What are the differences between `fruits`, `fruitbag` and
`fruitbox`?
1. Why does output for `fruitbag` differ from input? (1 Pt)
```py
>>> fruits = ['apple', 'pear', 'orange', 'banana']
>>> fruitbag = {'apple', 'pear', 'orange', 'banana'}
>>> fruitbox = ('apple', 'pear', 'orange', 'banana')
>>> fruits = ['apple', 'pear', 'orange', 'banana', 'apple']
>>> fruitbag = {'apple', 'pear', 'orange', 'banana', 'apple'}
>>> fruitbox = ('apple', 'pear', 'orange', 'banana', 'apple')
>>> print(fruits)
['apple', 'pear', 'orange', 'banana']
['apple', 'pear', 'orange', 'banana', 'apple']
>>> print(fruitbox)
('apple', 'pear', 'orange', 'banana')
('apple', 'pear', 'orange', 'banana', 'apple')
>>> print(fruitbag)
{'orange', 'banana', 'apple', 'pear'}
{'banana', 'apple', 'pear', 'orange'}
>>> print(fruits[1])
pear
......@@ -75,131 +85,195 @@ and answer questions on a piece of paper.
>>>
```
1. What is the structure for Eric called?
1. What is the structure for `eric1` called? How does it differ from
`eric2`? Explain outputs. (1 Pt)
```py
eric = {"name": "Eric", "salary": 5000, "birthday": "Sep 25 2001"}
eric1 = {"name": "Eric", "salary": 5000, "birthday": "Sep 25 2001"}
>>> print(eric)
eric2 = {"name", "Eric", "salary", 5000, "birthday", "Sep 25 2001"}
>>> print(eric1)
{'name': 'Eric', 'salary': 5000, 'birthday': 'Sep 25 2001'}
>>> print(eric["salary"])
>>> print(eric2)
{'Sep 25 2001', 5000, 'name', 'Eric', 'birthday', 'salary'}
#
# print(eric2) in same order?
# print salary for eric1 and eric2?
>>> print(eric1["salary"])
5000
```
1. Some people say that Arrays in other languages are Lists
in Python. Other people argue that Tuples are closer to Arrays.
- a) Which statement is (more) correct?
- b) Name two differences between Arrays and Lists?
(1 Pt)
- c) What are differences between a Python 2-dimensional List
(List with Lists) and a *NumPy* Array *ndarray* (read the first
[three paragraphs](https://numpy.org/doc/stable/user/whatisnumpy.html))?
(1 Pt)
1. Draw sketches to visualize Python data structures:
*List*, *Set*, *Tuple*, *Dictionary* and *NumPy Array*. (1 Pt)
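
As a starting point for the questions above, the following sketch probes a few properties of the built-in structures. The sample data `items` is invented for illustration, and the sketch does not cover *NumPy* arrays.

```python
items = ['a', 'b', 'a']          # hypothetical sample data

as_list = list(items)            # ordered, mutable, keeps duplicates
as_tuple = tuple(items)          # ordered, immutable, keeps duplicates
as_set = set(items)              # unordered, mutable, drops duplicates
as_dict = {"name": "Eric", "salary": 5000}   # maps keys to values

as_list.append('c')              # a list can grow

try:
    as_tuple[0] = 'z'            # a tuple cannot be changed
except TypeError as err:
    tuple_error = str(err)

try:
    as_set[0]                    # a set has no positions
except TypeError as err:
    set_error = str(err)

print(len(as_list), len(as_tuple), len(as_set))
print(as_dict["salary"])
print(tuple_error)
print(set_error)
```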
&nbsp;
### 3.) Challenge 3: Sorting Fruits
### 3.) Challenge: Sorting Fruits
1. What is the difference between *sort()* and built-in function *sorted()*,
[link](https://www.python-engineer.com/posts/sort-vs-sorted) (2 Pts)?
```py
>>> fruits = ['apple', 'pear', 'orange', 'banana', 'apple']
```py
>>> fruits = ['apple', 'pear', 'orange', 'banana']
>>> f1 = sorted(fruits)
>>> print(f"{f1},\n{fruits}")
['apple', 'apple', 'banana', 'orange', 'pear'],
['apple', 'pear', 'orange', 'banana', 'apple']
>>> f1 = sorted(fruits)
>>> print(f"{f1},\n{fruits}")
['apple', 'banana', 'orange', 'pear'],
['apple', 'pear', 'orange', 'banana']
>>> f2 = fruits.sort()
>>> print(f"{f2},\n{fruits}")
None,
['apple', 'banana', 'orange', 'pear']
```
>>> f2 = fruits.sort()
>>> print(f"{f2},\n{fruits}")
None,
['apple', 'apple', 'banana', 'orange', 'pear']
```
1. Some people say that Arrays in other languages are
Lists in Python. Other people argue that Tuples are Arrays.
- a) Which statement is (more) correct? (1 Pt)
- c) Why? (1 Pt)
- b) Name three differences between Arrays and Lists?
(3 Pt)
What is the difference between List-function *sort()* and built-in
function *sorted()*, see
[link](https://www.python-engineer.com/posts/sort-vs-sorted)?
1. Draw sketches to visualize Python data structures:
*List*, *Set*, *Tuple*, *Dictionary* and *Array* (from other
languages like C, C++). (1 Pt)
(2 Pts)
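
A further difference shows when the data is not a list. The sketch below assumes a small tuple `fruitbox` (different from the one used earlier) and also shows the optional `key` and `reverse` arguments that both `sorted()` and `list.sort()` accept.

```python
fruitbox = ('pear', 'apple', 'orange')       # a tuple has no sort() method

by_name = sorted(fruitbox)                   # returns a new list, tuple unchanged
by_length = sorted(fruitbox, key=len)        # order by word length
descending = sorted(fruitbox, reverse=True)  # reverse alphabetical order

basket = list(fruitbox)
result = basket.sort()                       # sorts in place and returns None

print(by_name, by_length, descending)
print(result, basket, hasattr(fruitbox, 'sort'))
```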
&nbsp;
### 4.) Challenge 4: Income Analysis
### 4.) Challenge: Income Analysis
The US Internal Revenue Service (IRS) annually
publishes income statistics by ZIP codes
publishes income statistics by ZIP codes (postal codes)
([reports](https://www.irs.gov/statistics/soi-tax-stats-individual-income-tax-statistics-2020-zip-code-data-soi)).
For example, California ZIP Code
[93636](https://simplemaps.com/us-zips/93636)
is a rural agricultural county of Madera, north of
is for *Madera* county, an agricultural region north of
Fresno in the Central Valley.
Income distribution for the tax year 2020 was:
The income distribution in this county for the tax year 2020 was:
```
income bracket: number of tax returns
filed in bracket
[$1 to under $25,000] 1,800
[$25,000 to under $50,000] 1,380
[$50,000 to under $75,000] 980
[$75,000 to under $100,000] 830
[$100,000 to under $200,000] 1,660
[$200,000 or more < $50M>] 550
[$1 to under $25,000] 1,800
[$25,000 to under $50,000] 1,380
[$50,000 to under $75,000] 980
[$75,000 to under $100,000] 830
[$100,000 to under $200,000] 1,660
[$200,000 or more, up to $10M] 550
```
Numbers mean that 980 tax returns were filed in the
bracket [$50,000 to under $75,000] taxable income.
In the bracket [$50,000 to under $75,000] of taxable income (exact to 74,999),
a total of 980 tax returns were filed, which means 980 tax payers reported
taxable income in this band.
We assume $10 million ($10M) as upper limit in the highest bracket.
A common statistical analysis is to compute:
- the *mean (average) income* per tax filer and the
- the *mean (average) income* and the
- the *median income* across all tax payers in *Madera* county.
- the *median income* per tax filer.
For calculating the *mean income*, average the income within
each bracket.
Assume $50 million as upper limit for *"more"* in the
highest bracket.
For calculating the *median income*, consider a linear rising
income from the lower bound to the upper bound in the bracket
holding the *median income*.
Answer questions:
1. What is the difference between *mean (average)* and
*median* calculations? (1 Pt)
*median* calculations?
- Why are both indicators relevant?
- When should one be preferred over the other?
1. Why are both indicators relevant? (1 Pt)
1. Calculate manually the *mean* or *average* income for *Madera* county
(result: *$453,073* ).
1. Calculate manually the *average* income for Madera
county.
1. Calculate manually the *median* income for *Madera* county
(result: *$60,714* ).
1. Calculate manually the *median* income for Madera
county.
(3 Pts)
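
To see why the two indicators can differ widely, consider a tiny sample. The five incomes below are invented for illustration and are unrelated to the IRS data; a single very high income pulls the mean up while the median stays with the typical value.

```python
# Invented incomes of five tax payers; one very high income skews the data.
incomes = [30_000, 40_000, 50_000, 60_000, 10_000_000]

mean = sum(incomes) / len(incomes)

ordered = sorted(incomes)
median = ordered[len(ordered) // 2]    # middle value of an odd-sized list

print(f"mean:   {mean:>12,.0f}")
print(f"median: {median:>12,}")
```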
&nbsp;
### 5.) Challenge 5: Code Income Analysis
### 5.) Challenge: Code Income Analysis
Write Python code to perform this income analysis.
Write Python code to perform this income analysis for arbitrary
ZIP regions.
<b>Use pure Python</b> (no *Pandas* nor *Numpy*) for this simple example.
Use file
<b>[income_tax_analysis.py](https://gitlab.bht-berlin.de/sgraupner/ds_cs4bd_2324/-/blob/main/B_explore_python/income_tax_analysis.py)</b>
as template.
<b>Use pure Python</b>, no libraries such as *Pandas* or *Numpy* or
built-in library functions.
Think about following steps:
1. Choose a suitable Python structure to represent tax data for a ZIP code.
1. Choose a suitable Python data structure to represent tax data for
a ZIP code.
- Which data is relevant for the analysis?
- How can data be structured?
- Use only Python structures: *list*, *set*, *tuple*, *dictionary*.
- Use only Python structures: *list*, *set*, *tuple*, *dictionary*.
(1 Pt)
1. Code data for one ZIP code into your structure (no need to read `.xlsx`-files).
1. Implement that data structure for the *Madera* data
(fill data in code, no need to read `.xlsx`-files). (1 Pt)
1. Define two functions `mean_income(...)` and `median_income(...)` that take
data for one ZIP code as input and return respective numbers.
1. Define two functions `mean_income(...)` and `median_income(...)`
that take a data structure for a ZIP code as input and return
respective numbers as results.
1. Define function `number_of_returns(...)`.
1. Define a function `number_of_returns(...)` that returns what the
name suggests.
1. Implement functions and demonstrate they return correct values.
1. Implement these functions and demonstrate they return correct values.
1. Demonstrate analysis for other ZIP codes:
1. Demonstrate analysis for other ZIP codes using IRS tax
[reports](https://www.irs.gov/statistics/soi-tax-stats-individual-income-tax-statistics-2020-zip-code-data-soi).
Select a state (CA: California, IA: Iowa, NY: New York), download
the `.xlsx` and navigate to ZIP codes for:
- [94040](https://simplemaps.com/us-zips/94040) (Mountain View, CA),
- [94304](https://simplemaps.com/us-zips/94304) (Palo Alto, CA),
- [94027](https://simplemaps.com/us-zips/94027) (Atherton, CA),
- [50860](https://simplemaps.com/us-zips/93636) (Redding, IA) and
- [10023](https://simplemaps.com/us-zips/10023) (New York City, NY Upper West side).
- [50860](https://simplemaps.com/us-zips/50860) (Redding, IA) and
- [10023](https://simplemaps.com/us-zips/10023) (New York City, NY Upper West side). (1 Pt)
Develop (test, debug) code in
[income_tax_analysis.py](https://gitlab.bht-berlin.de/sgraupner/ds_cs4bd_2324/-/blob/main/B_explore_python/income_tax_analysis.py)
in your IDE (*VS Code*, *PyCharm*, *Jupyter*, etc.).
Run the final result in a terminal (*Jupyter*: export Python file):
```sh
$ python income_tax_analysis.py
```
Results:
```
mean_income in Madera County, CA is: 453,073 - median_income is: 60,714
mean_income in Mountain View, CA is: 1,740,371 - median_income is: 114,820
mean_income in Palo Alto, CA is: 2,077,038 - median_income is: 153,658
mean_income in Atherton, CA is: 2,623,882 - median_income is: 354,088
mean_income in Redding, IA is: 33,333 - median_income is: 31,250
mean_income in New York City, NY U West is: 1,544,991 - median_income is: 104,775
```
(4 Pts)
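
One possible approach, shown here only as a sketch: represent a ZIP code as a list of `(lower bound, upper bound, number of returns)` tuples. This structure and the name `madera` are assumptions and not the template's required design. The mean uses the midpoint of each bracket, the median interpolates linearly inside the bracket that holds the middle tax return, and the built-in `round()` is used only to produce whole dollars.

```python
# Assumed structure: (lower bound, upper bound, number of returns) per bracket.
madera = [
    (1, 25_000, 1_800),
    (25_000, 50_000, 1_380),
    (50_000, 75_000, 980),
    (75_000, 100_000, 830),
    (100_000, 200_000, 1_660),
    (200_000, 10_000_000, 550),   # $10M upper limit as assumed in the text
]

def number_of_returns(brackets):
    """Total number of tax returns across all brackets."""
    total = 0
    for _low, _high, count in brackets:
        total += count
    return total

def mean_income(brackets):
    """Mean income, taking the midpoint of each bracket as its average."""
    weighted = 0
    for low, high, count in brackets:
        weighted += (low + high) / 2 * count
    return round(weighted / number_of_returns(brackets))

def median_income(brackets):
    """Median income, interpolated linearly within the median bracket."""
    half = number_of_returns(brackets) / 2
    seen = 0
    for low, high, count in brackets:
        if seen + count >= half:
            return round(low + (half - seen) / count * (high - low))
        seen += count
    raise ValueError("no tax returns in brackets")

print(f"{number_of_returns(madera):,} returns, "
      f"mean {mean_income(madera):,}, median {median_income(madera):,}")
```

For the Madera data this reproduces the values listed above (453,073 and 60,714).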
&nbsp;
### 6.) Challenge 6: Python built-in functions
Learn about Python's [built-in functions](https://docs.python.org/3/library/functions.html). Test the [*globals()*](https://docs.python.org/3/library/functions.html#globals) function.
### 6.) Challenge: Explore Python built-in functions
Learn about Python's
[built-in functions](https://docs.python.org/3/library/functions.html).
Test the
[*globals()*](https://docs.python.org/3/library/functions.html#globals)
function.
```py
>>> globals()
{'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <class '_frozen_
......@@ -214,4 +288,4 @@ Test the [*input()*](https://docs.python.org/3/library/functions.html#input) fun
"Monty Python's Flying Circus"
exit()
```
(2 Pts)
(1 Pt)
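
A small script makes the content of `globals()` easier to follow than the interpreter output. This is a minimal sketch; the names `answer`, `greet` and `added_later` are hypothetical.

```python
answer = 42                      # hypothetical module-level name

def greet():
    return "hello"

names = globals()                # dictionary of the current module's names
print(type(names).__name__)
print('answer' in names, 'greet' in names)

# The dictionary is live: adding a key creates a module-level variable.
names['added_later'] = 'created through globals()'
print(added_later)
```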
'''
Code for Assignment B: Explore Python
- Challenge 1: Indexing Fruits
- Challenge 2: Packaging Fruits
- Challenge 3: Sorting Fruits
'''
fruits = ['apple', 'pear', 'orange', 'banana', 'apple']
fruitbag = {'apple', 'pear', 'orange', 'banana', 'apple'}
fruitbox = ('apple', 'pear', 'orange', 'banana', 'apple')
def index_fruits():
# fruits = ['apple', 'pear', 'orange', 'banana']
print(fruits)
print(len(fruits))
print(f"the third fruit is: {fruits[2]}")
print(f"the second and third fruits are: {fruits[1:3]}")
print(f"the last fruit is: {fruits[-1]}")
print(f"the last two fruits are: {fruits[-2:]}")
def package_fruits():
print(f'fruits: {fruits}')
print(f'fruitbox: {fruitbox}')
print(f'fruitbag: {fruitbag}')
print(fruits[1])
print(fruitbox[1])
# print(fruitbag[1])
#
eric = {"name": "Eric", "salary": 5000, "birthday": "Sep 25 2001"}
print(eric)
print(eric["salary"])
def sort_fruits():
f1 = sorted(fruits)
print(f"{f1},\n{fruits}")
f2 = fruits.sort()
print(f"{f2},\n{fruits}")
if __name__ == '__main__':
'''
Main driver that runs when this file is executed by the Python interpreter.
'''
index_fruits()
package_fruits()
sort_fruits()
#
# print(globals())
'''
Code for Assignment B: Explore Python
- Challenge 4: Income Analysis
- Challenge 5: Code Income Analysis
Based on 2020 tax returns, income distribution in Madera County, CA
with (postal) ZIP code 93636 was:
#
income brackets: number of tax returns
filed in brackets:
[$1 to under $25,000] 1,800
[$25,000 to under $50,000] 1,380
[$50,000 to under $75,000] 980
[$75,000 to under $100,000] 830
[$100,000 to under $200,000] 1,660
[$200,000 or more, up to $10M>] 550
'''
# design a data structure that stores information about a ZIP area
# that is relevant for mean/median tax analysis
zip_93636 = None
zip_94040 = None
zip_94304 = None
zip_94027 = None
zip_50860 = None
zip_10023 = None
# implement a function that calculates the mean income for a ZIP area
def mean_income(_zip) -> int:
return 25000 # mock result, replace with computed result
# implement a function that calculates the median income for a ZIP area
def median_income(_zip) -> int:
return 18500 # mock result, replace with computed result
# use this function to print results for a ZIP area
def print_analysis(_zip):
_county = 'Madera, CA'
print(
f'mean_income in {_county:26} is: {mean_income(_zip):10,} - ' +
f'median_income is: {median_income(_zip):8,}'
)
# attempt to load solution module (if present - ignore)
try:
solution_module = 'income_tax_analysis_sol'
mod = __import__(solution_module, globals(), locals(), [], 0)
mean_income, median_income, print_analysis = mod.mean_income, mod.median_income, mod.print_analysis
zip_93636, zip_94040, zip_94304, zip_94027, zip_50860, zip_10023 = \
mod.zip_93636, mod.zip_94040, mod.zip_94304, mod.zip_94027, mod.zip_50860, mod.zip_10023
#
except ImportError:
pass
if __name__ == '__main__':
'''
driver code that runs when file is directly executed
'''
print_analysis(zip_93636)
print_analysis(zip_94040)
print_analysis(zip_94304)
print_analysis(zip_94027)
print_analysis(zip_50860)
print_analysis(zip_10023)
{
"python.testing.unittestArgs": [
"-v",
"-s",
".",
"-p",
"test_*.py"
],
"python.testing.pytestEnabled": false,
"python.testing.unittestEnabled": true
}
\ No newline at end of file
# Assignment C: Python Expressions & Unit Tests &nbsp; (16 Pts)
This assignment demonstrates Python's powerful (*"one-liner"*) expressions.
### Challenges
- [Challenge 1:](#1-challenge-create-new-project) Create New Project
- [Challenge 2:](#2-challenge-run-code) Run Code
- [Challenge 3:](#3-challenge-run-unit-tests) Run Unit Tests
- [Challenge 4:](#4-challenge-write-expressions) Write Expressions
- [Challenge 5:](#5-challenge-final-test-and-sign-off) Final Test and sign-off
Points: [1, 2, 3, 0, 10]
&nbsp;
### 1.) Challenge: Create New Project
Create a Python project, e.g. named `C_expressions`, and
[pull files](https://gitlab.bht-berlin.de/sgraupner/ds_cs4bd_2324/-/tree/main/C_expressions)
from GitLab (above).
Inspect files and figure out their purpose. Write 1-2 sentences about what each file is
and what its purpose is:
- [__init __.py](https://gitlab.bht-berlin.de/sgraupner/ds_cs4bd_2324/-/blob/main/C_expressions/__init__.py)
: `_____________________________________`
- What does the init-file contain?
- When and how often is this file executed?
- [expressions.py](https://gitlab.bht-berlin.de/sgraupner/ds_cs4bd_2324/-/blob/main/C_expressions/expressions.py)
: `__________________________________`
- [test_expressions.py](https://gitlab.bht-berlin.de/sgraupner/ds_cs4bd_2324/-/blob/main/C_expressions/test_expressions.py)
: `______________________________`
(1 Pt)
&nbsp;
### 2.) Challenge: Run Code
Run file `expressions.py` in your IDE:
```
numbers: [4, 12, 3, 8, 17, 12, 1, 8, 7]
#
a) number of numbers: 9
b) first three numbers: []
c) last three numbers: []
d) last three numbers reverse: []
e) odd numbers: []
f) number of odd numbers: 0
g) sum of odd numbers: 0
h) duplicate numbers removed: []
i) number of duplicate numbers: 0
j) ascending, de-dup (n^2) numbers: []
k) length: NEITHER
[Done] exited with code=0 in 0.126 seconds
```
(1 Pt)
Run file `expressions.py` in terminal:
```sh
cd <project> # cd into project directory
pwd # print working directory
/c/.../workspaces/ds_cs4bd_2324/C_expressions
python expressions.py # run program
-->
numbers: [4, 12, 3, 8, 17, 12, 1, 8, 7]
#
a) number of numbers: 9
b) first three numbers: [4, 12, 3]
c) last three numbers: [1, 8, 7]
d) last three numbers reverse: [7, 8, 1]
e) odd numbers: [3, 17, 1, 7]
f) number of odd numbers: 4
g) sum of odd numbers: 28
h) duplicate numbers removed: [1, 3, 4, 7, 8, 12, 17]
i) number of duplicate numbers: 2
j) ascending, de-dup (n^2) numbers: [1, 9, 16, 49, 64, 144, 289]
k) length: ODD_LIST
```
(1 Pt)
&nbsp;
### 3.) Challenge: Run Unit Tests
Unit Tests are used to *"test-a-unit"* of code in isolation. This unit can be
a function, a file, a class, a module.
In contrast to running code regularly, Unit Tests execute under the
supervision of a `test runner` that:
- looks for (discovers) tested units,
- executes them with test data, collects test results regardless
whether a test succeeded or failed and
- reports test results at the end.
Read *"A Beginner’s Guide to Unit Tests in Python"*,
[link](https://www.dataquest.io/blog/unit-tests-python/),
and answer questions:
- How are tests discovered? Which feature makes the test runner collect
something as a test?
- What is an
[assert](https://docs.python.org/3/library/unittest.html#assert-methods)
statement? What happens when a test (assert) passes and fails?
- Where is the test runner started in given files?
(1 Pt)
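
The following self-contained sketch may help with these questions. It is not taken from `test_expressions.py`; the function `first_three` and the class `TestFirstThree` are hypothetical. The test loader collects methods whose names start with `test` in classes that inherit from `unittest.TestCase`, and here the runner is started programmatically instead of from the command line.

```python
import unittest

def first_three(numbers):                    # hypothetical unit under test
    return numbers[:3]

class TestFirstThree(unittest.TestCase):     # collected: subclass of TestCase

    def test_returns_first_three(self):      # collected: name starts with "test"
        self.assertEqual(first_three([4, 12, 3, 8]), [4, 12, 3])

    def test_short_list(self):
        self.assertEqual(first_three([1]), [1])

    def helper(self):                        # not collected: name lacks the prefix
        return None

# Load the tests of the class and run them with a text test runner.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestFirstThree)
result = unittest.TextTestRunner(verbosity=2).run(suite)
print(result.testsRun, result.wasSuccessful())
```

A failing `assertEqual` does not stop the run: the runner records the failure, continues with the next test and lists all failures in the final report.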
Run tests in a terminal. Currently, only one test runs and passes:
*TestCase_a_number_of_numbers* :
```sh
python test_expressions.py # run tests directly from file calling the
# test runner in __main__
```
Output:
```
test_a_number_of_numbers (C_expressions.test_expressions.TestCase_a_number_of_nu
mbers.test_a_number_of_numbers) ... ok
----------------------------------------------------------------------
Ran 1 test in 0.001s
OK
<unittest.runner.TextTestResult run=1 errors=0 failures=0>
```
Result: 1 test was performed that passed.
Alternatively, run tests with test discovery. Run the unit test module that
starts the test runner, which in turn discovers tests that are then executed:
```sh
python -m unittest # let test runner discover tests
```
Output is the same as above.
(1 Pt)
Configure your IDE so it runs Unit Tests (you can use an IDE other than VS Code,
which is used here as an example).
VSCode discovers unit tests under the test glass icon (red circled).
The figure shows one unit test that has been discovered passing. Unit tests are
structured as *"TestCase - Classes"*, which are classes that inherit from class:
[unittest.TestCase](https://docs.python.org/3/library/unittest.html#unittest.TestCase),
in the example indirectly through class `Test_case_a`.
VSCode shows discovered test classes in the left panel and their execution result
with a green check mark when passed or a red cross when failed.
![](../markup/img/C_unit_tests_1.png)
Uncomment tests: *"Test_case_b"* and *"Test_case_c"* in `test_expressions.py`
above and re-run tests.
Both tests should fail because expressions they test have not been implemented:
![](../markup/img/C_unit_tests_2.png)
Re-run unit tests with the two tests failing in the terminal:
```sh
python -m unittest # let test runner discover tests
```
Output shows one passing and two failed tests:
```
======================================================================
FAIL: test_b_first_three_numbers (test_expressions.TestCase_b_first_three_number
s.test_b_first_three_numbers)
----------------------------------------------------------------------
Traceback (most recent call last):
File "C:\Sven1\svgr\workspaces\ds_cs4bd_2324\C_expressions\test_expressions.py
", line 103, in test_b_first_three_numbers
self.assertEqual(self.ut1.b, [4, 12, 3])
AssertionError: Lists differ: [] != [4, 12, 3]
Second list contains 3 additional elements.
First extra element 0:
4
- []
+ [4, 12, 3]
======================================================================
FAIL: test_c_last_three_numbers (test_expressions.TestCase_c_last_three_numbers.
test_c_last_three_numbers)
----------------------------------------------------------------------
Traceback (most recent call last):
File "C:\Sven1\svgr\workspaces\ds_cs4bd_2324\C_expressions\test_expressions.py
", line 117, in test_c_last_three_numbers
td.assertEqual(td.ut1.c, [1, 8, 7])
AssertionError: Lists differ: [] != [1, 8, 7]
Second list contains 3 additional elements.
First extra element 0:
1
- []
+ [1, 8, 7]
----------------------------------------------------------------------
Ran 3 tests in 0.002s
FAILED (failures=2)
```
Output says: `Ran 3 tests`, `FAILED (failures=2)`.
When tests fail, the test report tells which tests have failed and why:
- *test_b_first_three_numbers* failed in line: 103. The test expected
result: `[4, 12, 3]`, but an empty list `[]` was found in the tested
expression: `self.b` in file `expressions.py`.
- *test_c_last_three_numbers* failed in line: 117 where the test expected
result: `[1, 8, 7]`, but an empty list `[]` was found in: `self.c`
Tests refer to the `self.numbers` list: `[4, 12, 3, 8, 17, 12, 1, 8, 7]`.
(1 Pt)
&nbsp;
### 4.) Challenge: Write Expressions
In order to let tests pass, write expressions in
[expressions.py](https://gitlab.bht-berlin.de/sgraupner/ds_cs4bd_2324/-/blob/main/C_expressions/expressions.py)
for variables `self.b` .. `self.k` according to specification, e.g. write an
expression for `self.b` that extracts the first three numbers from `self.numbers`.
Use <b>one-line expressions</b> only.
Python's [built-in functions](https://docs.python.org/3/library/functions.html)
are allowed, but not own functions.
Tests exercise expressions with various lists. Initialization with constants
(`self.b = [4, 12, 3]`) will hence not work.
Write expressions incrementally, one after the other - not all at once. Some
expressions require thinking and reading.
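
The building blocks for such one-line expressions are slicing, comprehensions, built-in functions, sets and conditional expressions. The sketch below shows them on a hypothetical list `values`. It deliberately does not use the assignment's data or variable names, and the tasks are not the same as a) to k).

```python
values = [5, 2, 9, 2, 6, 5]     # hypothetical list, not the assignment's data

first_two = values[:2]                              # slicing
reversed_copy = values[::-1]                        # slicing with negative step
evens = [v for v in values if v % 2 == 0]           # comprehension with filter
even_total = sum(v for v in values if v % 2 == 0)   # generator fed to a built-in
unique_sorted = sorted(set(values))                 # a set removes duplicates
label = "LONG" if len(values) > 3 else "SHORT"      # conditional expression

print(first_two, reversed_copy, evens, even_total, unique_sorted, label)
```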
Once you have written an expression, uncomment the corresponding test case in
[test_expressions.py](https://gitlab.bht-berlin.de/sgraupner/ds_cs4bd_2324/-/blob/main/C_expressions/test_expressions.py):
and re-run the test. See if it is passing or figure out why it is failing
from the test report.
![](../markup/img/C_unit_tests_3.png)
Test cases a), b) and c) are now passing.
Continue until all tests pass.
![](../markup/img/C_unit_tests_4.png)
&nbsp;
### 5.) Challenge: Final Test and sign-off
For sign-off, change into `C_expressions` directory and copy commands into a terminal:
```sh
# Fetch test file from Gitlab and run tests for sign-off.
# The sed-command removes comments from test cases.
test_url=https://gitlab.bht-berlin.de/sgraupner/ds_cs4bd_2324/-/raw/main/C_expressions/test_expressions.py
curl $test_url | \
sed -e 's/^#.*Test_case_/Test_case_/' | \
python
```
Result:
```
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 7874 100 7874 0 0 55666 0 --:--:-- --:--:-- --:--:-- 56242
...........
----------------------------------------------------------------------
Ran 11 tests in 0.003s
OK
```
11 tests succeeded.
(10 Pts, 1 Pt for each test passing)
"""
Special file __init__.py marks a directory as a Python package.
The file is executed once, when the package or one of its modules is first imported.
//
Python's unittest requires the file to be present, even if it is empty.
"""
# load setup module when executed in parent directory
try:
__import__('setup')
#
except ImportError:
pass
class Expressions:
"""
Fill in one-line expressions (no own functions) to initialize attributes
self.b .. self.k with specified values.
Use Python built-in functions, list expressions and list comprehension,
but NOT own functions.
Complete tasks one after another. Once you are done with one task,
uncomment test cases in test_expressions.py. Remove comments for
# Test_case_b = Test_case
# Test_case_c = Test_case
# Test_case_d = Test_case
# ...
Run tests in IDE and in a terminal:
python test_expressions.py
python -m unittest
"""
default_numbers=[4, 12, 3, 8, 17, 12, 1, 8, 7]
def __init__(self, _numbers=default_numbers):
"""
Constructor to initialize member variables.
"""
self.numbers = _numbers
# a) initialize with number of numbers: 9
self.a = len(self.numbers) # <-- given solution, insert one-line expressions below
# b) initialize with first three numbers: [4, 12, 3]
self.b = [] # <-- write expression here
# c) initialize with last three numbers: [1, 8, 7]
self.c = []
# d) initialize with last three numbers reverse: [7, 8, 1]
self.d = []
# e) initialize with odd numbers: [3, 17, 1, 7]
self.e = []
# f) initialize with number of odd numbers: 4
self.f = 0
# g) initialize with sum_ of odd numbers: 28
self.g = 0
# h) duplicate numbers removed: [4, 12, 3, 8, 17, 1, 7]
self.h = []
# i) number of duplicate numbers: 2
self.i = 0
# j) ascending list of squared numbers with no duplicates: [1, 9, 16, 49, 64, 144, 289]
self.j = []
# k) initialize with "ODD_LIST", "EVEN_LIST" or "EMPTY_LIST" depending on numbers length
self.k = "NEITHER"
# attempt to load solution module (ignore)
try:
_from, _import = 'expressions_sol', 'Stream'
mod = __import__(_from, fromlist=[_import])
mod.set_solution(self) # invoke set_solution() to replace values with solutions
#
except ImportError:
pass
def print_results(self):
print(f'\nnumbers: {self.numbers}\n#')
fmt = {
# key: (value, output string)
'a': (self.a, 'number of numbers'),
'b': (self.b, 'first three numbers'),
'c': (self.c, 'last three numbers'),
'd': (self.d, 'last three numbers reverse'),
'e': (self.e, 'odd numbers'),
'f': (self.f, 'number of odd numbers'),
'g': (self.g, 'sum of odd numbers'),
'h': (self.h, 'duplicate numbers removed'),
'i': (self.i, 'number of duplicate numbers'),
'j': (self.j, 'ascending, de-dup (n^2) numbers'),
'k': (self.k, 'length'),
}
# format output, e.g.: "b) first three numbers: [1, 4, 6]"
for k in sorted(fmt.keys()):
print(f'{k}) {fmt[k][1]}: {fmt[k][0]}')
if __name__ == '__main__':
'''
Driver code that runs when this file is directly executed.
'''
#
n1 = Expressions() # use default list
#
# 2nd object with different list
n2 = Expressions([1, 4, 6, 67, 6, 8, 23, 8, 34, 49, 67,
6, 8, 23, 37, 67, 6, 34, 19, 67, 6, 8])
#
n1.print_results()
# n2.print_results() # try also other list
"""
Run unit tests with discovery (-m) or from __main__() with verbosity level 2
- python -m unittest
- python test_expressions.py
Output with verbosity level < 2:
================================
...........
----------------------------------------------------------------------
Ran 11 tests in 0.002s
OK
<unittest.runner.TextTestResult run=11 errors=0 failures=0>
"""
import unittest
import abc # import Abstract Base Class (ABC) from module abc
from expressions import Expressions
"""
tested objects (objects "under test", "ut") as instances of the Expressions class
"""
ut1 = Expressions(Expressions.default_numbers) # [4, 12, 3, 8, 17, 12, 1, 8, 7]
ut2 = Expressions([1, 4, 6, 67, 6, 8, 23, 8, 34, 49, 67, 6, 8, 23, 37, 67, 6, 34, 19, 67, 6, 8])
ut3 = Expressions([6, 67, 6, 8, 17, 3, 6, 8])
ut4 = Expressions([8, 3, 9])
ut5 = Expressions([1, 1, 1])
ut6 = Expressions([0, 0])
ut7 = Expressions([0])
ut8 = Expressions([])
class Test_case(unittest.TestCase):
"""
Top-level class that inherits from class unittest.TestCase
and injects test data into derived classes for test cases.
Sub-classes are discovered as unit tests.
"""
def setUp(self):
self.ut1 = ut1
self.ut2 = ut2
self.ut3 = ut3
self.ut4 = ut4
self.ut5 = ut5
self.ut6 = ut6
self.ut7 = ut7
self.ut8 = ut8
# disable tests by assigning Python's Abstract Base Class (ABC) to test
# case classes, which will not be discovered as unit tests
Test_case_a = Test_case_b = Test_case_c = Test_case_d = \
Test_case_e = Test_case_f = Test_case_g = Test_case_h = \
Test_case_i = Test_case_j = Test_case_k = abc.ABC
# assign Test_case class (above) as subclass of unittest.TestCase and with
# attributes of tested objects (self.ut1...ut8)
# uncomment tests one after another as you progress with expressions
Test_case_a = Test_case # test a) passes, solution is given in expressions.py
# Test_case_b = Test_case
# Test_case_c = Test_case
# Test_case_d = Test_case
# Test_case_e = Test_case
# Test_case_f = Test_case
# Test_case_g = Test_case
# Test_case_h = Test_case
# Test_case_i = Test_case
# Test_case_j = Test_case
# Test_case_k = Test_case
class TestCase_a_number_of_numbers(Test_case_a):
#
# tests a): number of numbers tests (lengths of numbers lists)
def test_a_number_of_numbers(self):
self.assertEqual(self.ut1.a, 9)
self.assertEqual(self.ut2.a, 22)
self.assertEqual(self.ut3.a, 8)
self.assertEqual(self.ut4.a, 3)
self.assertEqual(self.ut5.a, 3)
self.assertEqual(self.ut6.a, 2)
self.assertEqual(self.ut7.a, 1)
self.assertEqual(self.ut8.a, 0)
class TestCase_b_first_three_numbers(Test_case_b):
#
# tests b): first three numbers
def test_b_first_three_numbers(self):
self.assertEqual(self.ut1.b, [4, 12, 3])
self.assertEqual(self.ut2.b, [1, 4, 6])
self.assertEqual(self.ut3.b, [6, 67, 6])
self.assertEqual(self.ut4.b, [8, 3, 9])
self.assertEqual(self.ut5.b, [1, 1, 1])
self.assertEqual(self.ut6.b, [0, 0])
self.assertEqual(self.ut7.b, [0])
self.assertEqual(self.ut8.b, [])
class TestCase_c_last_three_numbers(Test_case_c):
#
# tests c): last three numbers
def test_c_last_three_numbers(td):
td.assertEqual(td.ut1.c, [1, 8, 7])
td.assertEqual(td.ut2.c, [67, 6, 8])
td.assertEqual(td.ut3.c, [3, 6, 8])
td.assertEqual(td.ut4.c, [8, 3, 9])
td.assertEqual(td.ut5.c, [1, 1, 1])
td.assertEqual(td.ut6.c, [0, 0])
td.assertEqual(td.ut7.c, [0])
td.assertEqual(td.ut8.c, [])
class TestCase_d_last_threeClass_in_reverse(Test_case_d):
#
# tests d): last three numbers in reverse
def test_d_last_threeClass_in_reverse(td):
td.assertEqual(td.ut1.d, [7, 8, 1])
td.assertEqual(td.ut2.d, [8, 6, 67])
td.assertEqual(td.ut3.d, [8, 6, 3])
td.assertEqual(td.ut4.d, [9, 3, 8])
td.assertEqual(td.ut5.d, [1, 1, 1])
td.assertEqual(td.ut6.d, [0, 0])
td.assertEqual(td.ut7.d, [0])
td.assertEqual(td.ut8.d, [])
class TestCase_e_odd_numbers(Test_case_e):
#
# tests e): odd numbers, order must be preserved
def test_e_odd_numbers(td):
td.assertEqual(td.ut1.e, [3, 17, 1, 7])
td.assertEqual(td.ut2.e, [1, 67, 23, 49, 67, 23, 37, 67, 19, 67])
td.assertEqual(td.ut3.e, [67, 17, 3])
td.assertEqual(td.ut4.e, [3, 9])
td.assertEqual(td.ut5.e, [1, 1, 1])
td.assertEqual(td.ut6.e, [])
td.assertEqual(td.ut7.e, [])
td.assertEqual(td.ut8.e, [])
class TestCase_f_number_of_odd_numbers(Test_case_f):
#
# tests f): number of odd numbers
def test_f_number_of_odd_numbers(td):
td.assertEqual(td.ut1.f, 4)
td.assertEqual(td.ut2.f, 10)
td.assertEqual(td.ut3.f, 3)
td.assertEqual(td.ut4.f, 2)
td.assertEqual(td.ut5.f, 3)
td.assertEqual(td.ut6.f, 0)
td.assertEqual(td.ut7.f, 0)
td.assertEqual(td.ut8.f, 0)
class TestCase_g_sum_of_odd_numbers(Test_case_g):
#
# tests g): sum of odd numbers
def test_g_sum_of_odd_numbers(td):
td.assertEqual(td.ut1.g, 28)
td.assertEqual(td.ut2.g, 420)
td.assertEqual(td.ut3.g, 87)
td.assertEqual(td.ut4.g, 12)
td.assertEqual(td.ut5.g, 3)
td.assertEqual(td.ut6.g, 0)
td.assertEqual(td.ut7.g, 0)
td.assertEqual(td.ut8.g, 0)
class TestCase_h_duplicateClass_removed(Test_case_h):
#
# tests h): duplicate numbers removed - use set() to accept any order
def test_h_duplicateClass_removed(td):
td.assertEqual(set(td.ut1.h), {4, 12, 3, 8, 17, 1, 7})
td.assertEqual(set(td.ut2.h), {1, 4, 6, 67, 8, 23, 34, 49, 37, 19})
td.assertEqual(set(td.ut3.h), {6, 67, 8, 17, 3})
td.assertEqual(set(td.ut4.h), {8, 3, 9})
td.assertEqual(td.ut5.h, [1])
td.assertEqual(td.ut6.h, [0])
td.assertEqual(td.ut7.h, [0])
td.assertEqual(td.ut8.h, [])
class TestCase_i_number_of_duplicate_numbers(Test_case_i):
#
# tests i): number of duplicate numbers
def test_i_number_of_duplicate_numbers(td):
td.assertEqual(td.ut1.i, 2)
td.assertEqual(td.ut2.i, 12)
td.assertEqual(td.ut3.i, 3)
td.assertEqual(td.ut4.i, 0)
td.assertEqual(td.ut5.i, 2) # [1, 1, 1] has 2 duplicates 1
td.assertEqual(td.ut6.i, 1) # [0, 0] has one duplicate number 0
td.assertEqual(td.ut7.i, 0)
td.assertEqual(td.ut8.i, 0)
class TestCase_j_ascending_squaredClass_no_duplicates(Test_case_j):
#
# tests j): ascending list of squared numbers with no duplicates
def test_j_ascending_squaredClass_no_duplicates(td):
td.assertEqual(set(td.ut1.j), {1, 9, 16, 49, 64, 144, 289})
td.assertEqual(set(td.ut2.j), {1, 16, 36, 64, 361, 529, 1156, 1369, 2401, 4489})
td.assertEqual(set(td.ut3.j), {9, 36, 64, 289, 4489})
td.assertEqual(set(td.ut4.j), {9, 64, 81})
td.assertEqual(td.ut5.j, [1])
td.assertEqual(td.ut6.j, [0])
td.assertEqual(td.ut7.j, [0])
td.assertEqual(td.ut8.j, [])
class TestCase_k_classifyClass_as_odd_even_empty(Test_case_k):
#
# tests k): classify as "ODD_LIST", "EVEN_LIST" or "EMPTY_LIST" depending on numbers length
def test_k_classifyClass_as_odd_even_empty(td):
td.assertEqual(td.ut1.k, "ODD_LIST")
td.assertEqual(td.ut2.k, "EVEN_LIST")
td.assertEqual(td.ut3.k, "EVEN_LIST")
td.assertEqual(td.ut4.k, "ODD_LIST")
td.assertEqual(td.ut5.k, "ODD_LIST")
td.assertEqual(td.ut6.k, "EVEN_LIST")
td.assertEqual(td.ut7.k, "ODD_LIST")
td.assertEqual(td.ut8.k, "EMPTY_LIST")
if __name__ == '__main__':
unittest.main()
# Cygwin setup on *Windows*
[Cygwin](https://www.cygwin.com) is a Unix-Emulator that provides
a terminal in which Unix commands can be executed on Windows
using
[bash](https://en.wikipedia.org/wiki/Bash_(Unix_shell))
(*Bourne Again Shell*) as command-line interpreter -
a terminal ([mintty](https://mintty.en.lo4d.com/windows))
in which Unix commands can be executed on Windows using
*[bash](https://en.wikipedia.org/wiki/Bash_(Unix_shell))*
(*Bourne Again Shell*) as command-line interpreter.
*bash* was developed in 1989 as a successor to the
*Bourne Shell:*
*Bourne Shell*
*[sh](https://en.wikipedia.org/wiki/Bourne_shell)*.
[<bash terminal>](https://cdn.ttgtmedia.com/rms/onlineimages/REF_bash_command_line_3.jpg)
Example of a
[bash terminal](https://cdn.ttgtmedia.com/rms/onlineimages/REF_bash_command_line_3.jpg).
*Cygwin* is <span style="text-decoration:underline">not</span>
a Unix container or virtual machine system (unlike
a Unix system, container or virtual machine (like
[WSL](https://learn.microsoft.com/en-us/windows/wsl/about)).
*Cygwin* emulates most (not all) Unix system calls such that
most Unix commands can execute on Windows.
*Cygwin* emulates most (but not all) Unix system calls such that
most Unix commands can be used on Windows.
[GitBash](https://gitforwindows.org)
is an alternative to *Cygwin* that uses a different emulator package
[MinGW](https://www.mingw-w64.org).
It has some flaws, for example, it performs path conversions that
may cause problems, see
[link](https://stackoverflow.com/questions/54258996/git-bash-string-parameter-with-at-start-is-being-expanded-to-a-file-path).
Alternatives to *Cygwin* such as
[GitBash]()
can be used, but have some flaws and drawbacks on Windows
(*GitBash*, for example, performs strange path conversions, see
[link](https://stackoverflow.com/questions/54258996/git-bash-string-parameter-with-at-start-is-being-expanded-to-a-file-path)).
Read
[Differences between Cygwin and MinGW](https://stackoverflow.com/questions/771756/what-is-the-difference-between-cygwin-and-mingw)
for more detail about the differences between *Cygwin* and *MinGW / GitBash*.
Good introductions to *bash* are:
- [https://cs.lmu.edu/~ray/notes/bash](https://cs.lmu.edu/~ray/notes/bash).
- [Introduction to Bash](https://cs.lmu.edu/~ray/notes/bash).
- Also recommended is the
[Tutorial for Beginners](https://linuxconfig.org/bash-scripting-tutorial-for-beginners).
- [Bash Tutorial for Beginners](https://linuxconfig.org/bash-scripting-tutorial-for-beginners).
&nbsp;
## Steps
1. [Installation](#1-installation)
2. [Configure paths in .bashrc](#2-configure-paths-in-bashrc)
1. [Install *Cygwin*](#1-install-cygwin)
2. [Configure *Cygwin* and *bash*](#2-configure-cygwin)
- switch from `/cygdrive/c` to `/c`
- select HOME-directory
- configure *.bashrc* in HOME-directory
- define PATH in *.bashrc*
3. [Customize *bash*](#3-customize-bash)
4. [References](#4-references)
&nbsp;
## 1. Installation
## 1. Install Cygwin
1. Download the installer `setup-x86_64.exe` from
[https://www.cygwin.com/install.html](https://www.cygwin.com/install.html).
......@@ -56,7 +65,7 @@ Good introductions to *bash* are:
1. Change *cygwin* default path `/cygdrive/c` to: `/c`:
- navigate to the *cygwin* installation directory.
- navigate to the *cygwin* installation directory called `cygwin64`.
- inside, edit `/etc/fstab` and replace line
```
......@@ -71,22 +80,37 @@ Good introductions to *bash* are:
- Find out your Windows user name: <user_name>
- Create or select a directory that you want to use as
HOME-directory for *bash*, e.g. your Windows
HOME-directory `C:\Users\<user_name>`.
- Create or select a directory to use as HOME-directory for
*bash*, e.g. your Windows HOME-directory `C:\Users\<user_name>`
(but also any other directory you may create as HOME).
Read
[Change Cygwin home folder after installation](https://stackoverflow.com/questions/1494658/how-can-i-change-my-cygwin-home-folder-after-installation).
- Edit file `/etc/nsswitch.conf`:
- to use Windows HOME-directory, comment line
- to use your Windows HOME-directory, comment line (put hash # in front)
```
#db_home:
```
- for using a new directory as HOME-directory, enter
```
db_home: /cygdrive/c/<path>
- to use a different HOME directory `<path>`, enter
```sh
db_home: /c/<path> # e.g. db_home: /c/users/svgr
```
1. Open a new *bash* terminal and test changes:
1. Create a terminal icon on your Desktop.
- In the installation directory (`cygwin64`), navigate to `./bin`
and find file
[mintty.exe](https://en.wikipedia.org/wiki/Mintty),
which is the terminal emulator executable.
- Create a shortcut for *mintty.exe*
(right-click file -> Create Shortcut -> On Desktop).
The terminal icon appears on your desktop.
1. Open *mintty* terminal and test:
```sh
$ whoami
......@@ -124,7 +148,11 @@ Good introductions to *bash* are:
1. Return to *bash* terminal:
- change to HOME-directory and
- open file `.bashrc`
- show content of file `.bashrc` using a text editor:
*[vim](https://www.vim.org)* (already installed with cygwin),
*[nano](https://www.nano-editor.org)* or
*[sublime](https://www.sublimetext.com)*
are good choices.
```sh
$ cd # change to bash HOME-directory
$ ls -la # find file .bashrc
......@@ -132,12 +160,12 @@ Good introductions to *bash* are:
-rwxr-xr-x+ 1 svgr2 Kein 2717 Oct 4 20:28 .bashrc
$ cat .bashrc # output file .bashrc
...
content of file .bashrc appears...
```
&nbsp;
## 2. Configure paths in `.bashrc`
## 2. Configure *Cygwin*
[PATH](https://en.wikipedia.org/wiki/PATH_(variable)) is an
environment variable on Unix-like operating systems that
......@@ -145,58 +173,80 @@ specifies a set of directories where executable programs are
located. A *"command not found"* error occurs when PATH is
not properly configured.
1. Open `.bashrc` using an editor, e.g. *vim* and append PATH
configurations.
1. Inspect the current PATH variable on your system:
```sh
echo $PATH
.:/usr/bin:/usr/local/bin:/c/WINDOWS:/c/WINDOWS/system32: ...
```
PATH has a list of directory paths starting from the root directory '/'
separated by colon ( : ).
To print PATH more readable, replace ( : ) by newlines:
```sh
echo $PATH | tr ':' '\n'
.
/usr/bin
/usr/local/bin
/c/WINDOWS
/c/WINDOWS/system32
...
```
Make sure these paths are included in PATH.
1. Open `.bashrc` using a text editor
(*[sublime](https://www.sublimetext.com)* or
*[vim](https://www.vim.org)* - already installed with
*cygwin* - are good choices).
Append lines for PATH configurations at the end.
```sh
$ vim .bashrc # open .bashrc in vim editor
$ vim .bashrc # open file .bashrc in vim editor
```
or drag file from file explorer into sublime for editing.
1. Append PATH configurations at the end of the file - only
those relevant on your system - and adjust paths for your
system:
1. Append following PATH configurations at the end of the file -
only those relevant to your system, with paths adjusted accordingly:
```sh
# add Windows system paths
export PATH=".:/usr/bin:/usr/local/bin"
export PATH="${PATH}:$(cygpath ${SYSTEMROOT}):$(cygpath ${SYSTEMROOT})/system32"
# add Java path
# if Java, add Java path (as first entry on PATH)
# make sure the path to Java exists on your system and is the path
# to a JDK (Java Development Kit) and not JRE, which is just the
# Java Runtime that has no compiler, no javadoc or jar.
export JAVA_HOME="/c/Program Files/Java/jdk-21"
export PATH="${PATH}:${JAVA_HOME}/bin"
export PATH="${JAVA_HOME}/bin:${PATH}"
# add Python path
# if Python, add Python path
export PYTHON_HOME="/c/Users/svgr2/AppData/Local/Programs/Python/Python312"
export PATH="${PATH}:${PYTHON_HOME}"
export PATH="${PATH}:${PYTHON_HOME}/Scripts"
# add Docker path
# if Docker, add Docker path
export DOCKER_HOME="/c/Program Files/Docker/Docker"
export PATH="${PATH}:${DOCKER_HOME}/resources/bin"
...
```
1. Verify paths have been added to *PATH* variable:
```sh
$ source .bashrc # reload .bashrc to activate changes
$ echo $PATH # show PATH
$ source .bashrc # reload .bashrc to activate PATH definitions
$ echo ${PATH} # show PATH
.:/usr/bin:/usr/local/bin:/c/WINDOWS:/c/WINDOWS/system32:/c/Program ...
$ echo $PATH | tr ':' '\n' # pretty print PATH
$ echo ${PATH} | tr ':' '\n' # pretty print PATH
.
/c/Program Files/Java/jdk-21/bin <-- path to JDK executables
/usr/bin
/usr/local/bin
/c/WINDOWS
/c/WINDOWS/system32
/c/Program Files/Java/jdk-21/bin
/c/opt/maven/bin
/c/Users/svgr2/AppData/Local/Programs/Python/Python312
/c/Users/svgr2/AppData/Local/Programs/Python/Python312/Scripts
/c/Program Files/Docker/Docker/resources/bin
/c/Program Files/MySQL/MySQL Workbench 8.0 CE
/c/Program Files (x86)/Git/bin
/c/Program Files/Oracle/VirtualBox
/c/opt/Qt6/Tools/mingw1120_64/bin
/c/opt/Qt6/6.2.4/mingw_64/bin
...
```
Paths may vary based on your system.
......@@ -215,3 +265,32 @@ not properly configured.
$ docker --version
Docker version 24.0.6, build ed223bc
```
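
The same check can be made from Python, which helps when a *"command not found"* error needs to be narrowed down. This is a minimal sketch using only the standard library; `report_tools` is a hypothetical helper name, and the tool names are examples taken from the configuration above.

```python
import os
import shutil


def report_tools(names):
    """Map each command name to the executable found on PATH, or None."""
    return {name: shutil.which(name) for name in names}


# one PATH entry per line, similar to: echo $PATH | tr ':' '\n'
# os.pathsep is ':' on Unix-like systems and ';' for native Windows Python
for entry in os.environ.get("PATH", "").split(os.pathsep):
    print(entry)

# tool names are examples only, adjust them to what is installed on your system
for name, location in report_tools(["python", "java", "docker"]).items():
    print(f"{name}: {location or 'command not found'}")
```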
&nbsp;
## 3. Customize *bash*
*bash* is widely customizable, usually by settings in *.bashrc*.
- [Bash Prompt customization](https://wiki.archlinux.org/title/Bash/Prompt_customization)
- [How to Change Colors on LS in Bash](https://linuxhint.com/ls_colors_bash)
- [How to Customize and Colorize the Bash Prompt](https://www.howtogeek.com/307701/how-to-customize-and-colorize-your-bash-prompt/)
Skip when no customization is desired.
&nbsp;
## 4. References
- [What is PATH?](https://en.wikipedia.org/wiki/PATH_(variable))
- Bash tutorials:
- [Introduction to Bash](https://cs.lmu.edu/~ray/notes/bash).
- [Tutorial for Beginners](https://linuxconfig.org/bash-scripting-tutorial-for-beginners).
- [Differences between Cygwin and MinGW](https://stackoverflow.com/questions/771756/what-is-the-difference-between-cygwin-and-mingw)
- [Change Cygwin HOME directory after installation](https://stackoverflow.com/questions/1494658/how-can-i-change-my-cygwin-home-folder-after-installation).
# Assignment D: Recursive Problem Solving &nbsp; (15 Pts + 4 Extra Pts)
Recursion is not just a *"function calling itself"*, it is a way of thinking
about a class of problems that can be split into simple *"base cases"* and
remaining *"sub-problems"* that are *"self-similar"*.
A *"sub-problem"* is self-similar when it looks exactly the same as the
original problem, just smaller (e.g. reduced by one element). At some point,
the simple *"base case"* is reached, which yields a primitive solution.
A recursive *solution function* exploiting self-similarity has three phases:
1. *Reduction:* - slicing the problem (e.g. a list of numbers) into one
element (e.g. the first number) and a remaining *sub-problem*
(e.g. the list of remaining numbers).
1. *Recursion:* - invoke the same function for the *sub-problem*
until the *sub-problem* has been reduced to the *base case*.
Return the solution for the *base case*.
1. *Construction:* - results of recursive invocations are considered
as solutions of *sub-problems* and are combined with the element
that was isolated at the particular level of recursion.
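
The phases can be seen in a small function that is deliberately not one of the challenges. The sketch below assumes a non-empty input list; `largest` is a hypothetical name chosen for illustration.

```python
def largest(items: list) -> int:
    """Return the largest element of a non-empty list recursively."""
    if len(items) == 1:                  # base case: one element is its own maximum
        return items[0]
    first, rest = items[0], items[1:]    # reduction: isolate one element
    sub_result = largest(rest)           # recursion: solve the smaller sub-problem
    # construction: combine the sub-result with the isolated element
    return first if first > sub_result else sub_result


print(largest([4, 12, 3, 8, 17, 12, 1, 8, 7]))
```

Note that `items[1:]` copies the remaining list at every level, which is part of the cost of this style in Python.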
While this approach is elegant from a thinking-about-problems and programming
point of view, it comes at a cost: each recursive invocation occupies a frame on the
[Call stack](https://en.wikipedia.org/wiki/Call_stack),
a data structure of the abstract data type
[Stack](https://en.wikipedia.org/wiki/Stack_(abstract_data_type)),
until the invocation returns.
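
The stack cost becomes visible when the recursion goes deeper than the interpreter allows. The following sketch assumes CPython, where the limit can be read with `sys.getrecursionlimit()`; `depth` is a hypothetical function used only to consume stack frames.

```python
import sys


def depth(n: int) -> int:
    """Recurse n levels deep; each level occupies one stack frame."""
    return 0 if n == 0 else 1 + depth(n - 1)


print(sys.getrecursionlimit())   # interpreter-specific limit
print(depth(100))                # well below the limit

try:
    depth(sys.getrecursionlimit() + 100)   # deeper than the interpreter permits
except RecursionError as err:
    print(f"RecursionError: {err}")
```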
### Challenges
- [Challenge 1:](#1-challenge-simple-recursion-sum-numbers) Simple recursion: *sum* numbers
- [Challenge 2:](#2-challenge-fibonacci-numbers) Fibonacci numbers
- [Challenge 3:](#3-challenge-permutation) Permutation
- [Challenge 4:](#4-challenge-powerset) Powerset
- [Challenge 5:](#5-challenge-find-matching-pairs) Find Matching Pairs
- [Challenge 6:](#6-challenge-combinatorial-problem-of-finding-numbers) Combinatorial Problem of Finding Numbers
- [Challenge 7:](#7-challenge-hard-problem-of-finding-numbers) Hard Problem of Finding Numbers
Points: [2, 1, 2, 2, 2, 3, 2, +4 extra pts]
File [recursion.py](recursion.py) has function headers defined for each challenge.
Use those functions and complete code.
&nbsp;
### 1.) Challenge: Simple recursion: *sum()* numbers
Computing the *sum* of numbers is most often performed *iteratively* as a loop
over given numbers and adding them in a result variable.
Solving the problem *recursively* illustrates the concept of self-similarity
and recursive problem solving.
Use the following approach:
1. *Reduction:* - split the given list of numbers into a first element (first number)
and a list of remaining numbers (*sub-problem*). Remember the first element.
1. *Recursion:* - invoke *sum()* for the list of remaining numbers until the base case
has been reached: *sum( [ ] ) = 0* or *sum( [n] )=n*.
1. *Construction:* - add the remembered element to the value returned from the
recursive invocation of *sum()*.
Complete: `sum(_numbers)` using this approach:
```py
def sum(self, _numbers) -> int:
# your code
return #...
```
Remove comment from `run_choices` and run the program:
```py
run_choices = [
1, # Challenge 1, Simple recursion: sum numbers
...
]
```
Output:
```
n1.numbers: [9, 4, 8, 10, 2, 4, 8, 3, 14, 4, 8]
sum(n1.numbers): 74
```
Answer questions:
1. How many times is the *"first element"* stored?
How much memory is used for applying the function to a list of *n* numbers?
1. What is the run-time estimate for *sum()* given a list of *n* numbers?
1. How many *stack-frames* are used for a list of *n* numbers?
(2 Pts)
&nbsp;
### 2.) Challenge: Fibonacci numbers
[Fibonacci numbers](https://en.wikipedia.org/wiki/Fibonacci_number) were first
described in Indian mathematics as early as 200 BC in works by *Pingala* on
enumerating possible patterns of Sanskrit poetry formed from syllables of two lengths.
Italian mathematician *Leonardo of Pisa*, later known as
*[Fibonacci](https://en.wikipedia.org/wiki/Fibonacci)*,
introduced the sequence to Western European mathematics in his 1202 book
*[Liber Abaci](https://en.wikipedia.org/wiki/Liber_Abaci)*.
Numbers of the *Fibonacci sequence* are defined as: *fib(0): 0*, *fib(1): 1*, *...*
and each following number is the sum of the two preceding numbers.
Fibonacci numbers are widely found in *nature*, *science*, *social behaviors* of
populations and *arts*, e.g. they form the basis of the
[Golden Ratio](https://www.adobe.com/creativecloud/design/discover/golden-ratio.html),
which is widely used in *painting* and *photography*, see also this
[1:32min](https://www.youtube.com/watch?v=v6PTrc0z4w4) video.
<img src="../markup/img/fibonacci.jpg" alt="drawing" width="640"/>
<!-- ![image](../markup/img/fibonacci.jpg) -->
&nbsp;
Complete functions `fib(n)` and `fib_gen(n)`.
```py
def fib(self, _n) -> int:
# return value of n-th Fibonacci number
return #...
def fib_gen(self, _n):
# return a generator object that yields two lists, one with n and the
# other with corresponding fib(n)
yield #...
```
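
Since `fib_gen` is meant to be a generator, here is a minimal sketch of the generator mechanism on an unrelated sequence. The function `squares` is a hypothetical example, not part of `recursion.py`.

```python
def squares(limit: int):
    """Yield square numbers 0, 1, 4, ... lazily, one per request."""
    for i in range(limit):
        yield i * i          # execution pauses here until the next value is requested


gen = squares(5)             # creates the generator, the function body has not run yet
print(next(gen))             # first value
print(next(gen))             # second value
print(list(gen))             # consumes the remaining values
```

Calling the function only creates the generator object; values are computed when `next()` or a loop asks for them.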
Remove comment from `run_choices` and run the program:
```py
run_choices = [
1, # Challenge 1, Simple recursion: sum numbers
2, # Challenge 2, Fibonacci numbers
...
]
```
Output:
```
n: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
fib(n): [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765]
```
Answer questions:
1. Explain the concept of a generator in Python.
1. Why can't `fib(60)` or `fib(90)` be computed recursively?
1. What is the more limiting constraint: memory use or needed run time?
```py
n = 30
print(f'fib({n}): {n1.fib(n)}')
n = 60
print(f'fib({n}): {n1.fib(n)}') # ??
n = 90
print(f'fib({n}): {n1.fib(n)}') # ??
```
Understand the problem and use a technique called
[memoization](https://stackoverflow.com/questions/7875380/recursive-fibonacci-memoization)
to make the solution work for *n=60* and *n=90* - still recursively (!).
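
To show the idea of memoization without giving away the Fibonacci solution, the sketch below applies it to binomial coefficients computed by Pascal's rule. The function `binom` and its `memo` parameter are assumptions made for this illustration; it expects `0 <= k <= n`.

```python
import math


def binom(n: int, k: int, memo=None) -> int:
    """Binomial coefficient via Pascal's rule, caching sub-results in memo."""
    if memo is None:
        memo = {}                    # fresh cache for each top-level call
    if k == 0 or k == n:
        return 1                     # base cases
    if (n, k) not in memo:           # compute each sub-problem only once
        memo[(n, k)] = binom(n - 1, k - 1, memo) + binom(n - 1, k, memo)
    return memo[(n, k)]


# cross-check against the standard library
print(binom(60, 30) == math.comb(60, 30))
```

Without the cache, the two recursive calls would recompute the same sub-problems over and over; with it, each pair `(n, k)` is computed once and then looked up.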
Remove comments `#21` and `#22` from `run_choices` and run the program:
Output:
```
fib(30): 832040
fib(60): 1548008755920
fib(90): 2880067194370816120
```
(2 Pts)
&nbsp;
### 3.) Challenge: Permutation
[Permutation](https://en.wikipedia.org/wiki/Permutation) is a list of all
arrangements of elements.
For example:
```py
perm([1, 2, 3]) -> [[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]
perm([]) -> [[]]
perm([1]) -> [[1]]
perm([1, 2]) -> [[1, 2], [2, 1]]
perm([1, 2, 3]) -> [[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]
perm([1, 2, 3, 4]) -> [[1, 2, 3, 4], [1, 2, 4, 3], ... [4, 3, 1, 2], [4, 3, 2, 1]]
```
Find a pattern how numbers are arranged for `perm([1, 2])` and `perm([1, 2, 3])`
and adapt it for `perm([1, 2, 3, 4])` to understand the algorithm.
Writing non-recursive code for that algorithm can be difficult, but it fits
well with the recursive sub-problem approach, which is elegant and takes
four steps:
1. Return solutions for trivial cases: `[]`, `[1]`, `[1, 2]`.
1. Split the problem by removing the first number `n1` from the list leaving `r` as
remaining list (sub-problem).
1. Invoke `perm(r)` recursively on the remaining list.
1. Combine the result returned from `perm(r)` by adding `n1` to each element.
```py
def perm(self, _numbers) -> list:
res=[] # collect result
# code...
# 1. Return solutions for trivial cases: `[]`, `[1]`, `[1, 2]`.
# 2. Split the problem by removing the first number `n1` from the list
# leaving `r` as remaining list (sub-problem).
# 3. Invoke `perm(r)` recursively on the remaining list.
# 4. Combine the result by adding `n1` to each returned element from `perm(r)`.
#
return res
lst = [1, 2, 3]
perm = n1.perm(lst)
print(f'perm({lst}) -> {perm}')
lst = [1, 2, 3, 4]
perm = n1.perm(lst)
print(f'perm({lst}) -> {perm}')
```
Remove comment `#3` from `run_choices` and run the program:
Output:
```
perm([1, 2, 3]) -> [[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]
perm([1, 2, 3, 4]) -> [[1, 2, 3, 4], [1, 2, 4, 3], ... [4, 3, 1, 2], [4, 3, 2, 1]]
```
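
A recursive `perm()` can be cross-checked against the standard library. This is a sketch under the assumption that the result order matches the lexicographic-by-position order of `itertools.permutations`, as in the output above; `reference_perm` is a hypothetical helper name.

```python
from itertools import permutations


def reference_perm(numbers: list) -> list:
    """All arrangements as lists, produced by the standard library."""
    return [list(p) for p in permutations(numbers)]


print(reference_perm([1, 2, 3]))

# number of arrangements for growing input lengths
for n in range(1, 7):
    print(n, len(reference_perm(list(range(n)))))
```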
Answer questions:
- With a rising length of the input list, how does the number of permutations grow?
(2 Pts)
&nbsp;
### 4.) Challenge: Powerset
[Powerset](https://en.wikipedia.org/wiki/Powerset) is a list of all
subsets of elements including the empty set.
For example:
```py
pset([1, 2, 3]) -> [[], [1], [2], [1, 2], [3], [1, 3], [2, 3], [1, 2, 3]]
```
Understand the pattern and complete function `pset()`.
```py
def pset(self, _numbers) -> list:
res=[] # collect result
# code...
# 1. Return solutions for trivial cases: `[]`, `[1]`, `[1, 2]`.
# 2. Split the problem by removing the first number `n1` from the list
# leaving `r` as remaining list (sub-problem).
# 3. Invoke `pset(r)` recursively on the remaining list.
# 4. Combine the result with the first element.
#
return res
lst = [1, 2, 3]
pset = n1.pset(lst)
print(f'pset({lst}) -> {pset}')
```
Remove comment `#4` from `run_choices` and run the program:
Output:
```py
pset([1, 2, 3]) -> [[], [1], [2], [1, 2], [3], [1, 3], [2, 3], [1, 2, 3]]
```
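
One way to see the pattern in this output is to count in binary: subset number `mask` contains element `i` exactly when bit `i` of `mask` is set. The sketch below is a non-recursive illustration of that ordering, not the recursive solution asked for; `pset_by_counting` is a hypothetical name.

```python
def pset_by_counting(numbers: list) -> list:
    """Enumerate subsets by counting from 0 to 2**n - 1 (non-recursive illustration)."""
    n = len(numbers)
    subsets = []
    for mask in range(2 ** n):
        # bit i of mask decides whether numbers[i] belongs to this subset
        subsets.append([numbers[i] for i in range(n) if mask & (1 << i)])
    return subsets


print(pset_by_counting([1, 2, 3]))
```

It can serve as a reference to compare the result of a recursive `pset()` against.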
Answer questions:
- With a rising length of the input list, how does the size of the Powerset grow?
(2 Pts)
&nbsp;
### 5.) Challenge: Find Matching Pairs
Write three functions to find elements in a list.
The first function, `find`, returns all elements that match a boolean `match_func`.
The second function, `find_adjacent`, finds all indexes of adjacent pairs
of numbers.
The third function, `find_pairs`, finds all pairs of numbers (not necessarily
adjacent) whose sum equals `n`. For example, `n=12` can be combined from
the input list with pairs: `[3, 9], [4, 8], [2, 10]`.
```py
def find(self, _numbers, match_func) -> list:
res = [] # code...
return res
def find_adjacent(self, pair, _numbers) -> list:
res = [] # code...
return res
def find_pairs(self, n, _numbers) -> list:
res = [] # code...
return res
lst = [9, 4, 8, 10, 2, 4, 8, 3, 14, 4, 8] # input list
#
div3 = n1.find(lst, match_func=lambda n : n % 3 == 0)
print(f'find numbers divisible by 3: {div3}')
#
p = [4, 8] # find all indexes of adjacent numbers [4, 8]
adj = n1.find_adjacent(p, lst)
print(f'find_adjacent({p}, list): {adj}')
#
n = 12 # find all pairs from the input list that add to n
pairs = n1.find_pairs(n, lst)
print(f'find_pairs({n}, list) -> {pairs}')
```
Remove comments `#5`, `#51` and `#52` from `run_choices` and run the program:
Output:
```
find numbers divisible by 3: [9, 3]
find_adjacent([4, 8], list): [1, 5, 9]
find_pairs(12, list) -> [[3, 9], [4, 8], [2, 10]]
```
Answer questions:
- With a rising length of the input list, how many steps are needed to
complete each function in the best and worst case and on average?
| function | answers |
| ---------------- | ----------- |
| `find` | best case: ______, worst case: ______, average: ______ steps. |
| `find_adjacent` | best case: ______, worst case: ______, average: ______ steps. |
| `find_pairs` | best case: ______, worst case: ______, average: ______ steps. |
(3 Pts)
&nbsp;
### 6.) Challenge: Combinatorial Problem of Finding Numbers
`find_all_sums` is a function that returns any combination of numbers from the
input list that add to `n`. For example, `n=14` can be combined from an
input list: `[8, 10, 2, 14, 4]` by combinations: `[4, 8, 2], [4, 10], [14]`.
A first approach to the problem is to understand the nature of possible
combinations from the input list. If all those combinations could be
generated, each could be tested whether its elements add to `n` and, if so,
be collected for the final result.
The order of numbers in solutions is not relevant (summation is commutative).
Duplicate solutions with the same numbers, but in different order, need to be removed.
```py
def find_all_sums(self, n, _numbers) -> list:
res = [] # code...
return res
lst = [8, 10, 2, 14, 4] # input list
n = 14
all = n1.find_all_sums(n, lst)
print(f'find_all_sums({n}, lst) -> {all}')
```
Output:
```py
find_all_sums(14, lst) -> [[4, 8, 2], [4, 10], [14]]
```
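
A brute-force baseline built on the standard library can be used to cross-check results for short lists. This is a sketch only, not the intended recursive solution: `brute_force_sums` is a hypothetical name, and its result order differs from the output above.

```python
from itertools import combinations


def brute_force_sums(n: int, numbers: list) -> list:
    """Test every combination of every length; feasible only for short lists."""
    found = []
    for size in range(1, len(numbers) + 1):
        for combo in combinations(numbers, size):   # picks by position, order-independent
            if sum(combo) == n:
                found.append(list(combo))
    return found


print(brute_force_sums(14, [8, 10, 2, 14, 4]))
```

Because `combinations` selects by position, an input list that contains the same value more than once can still produce solutions with equal numbers, which would need to be removed separately.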
Test your solution with a larger input set:
```py
lst = [ # input list
260, 720, 225, 179, 101, 767, 167, 200, 157, 289,
318, 303, 153, 290, 201, 594, 457, 607, 592, 246,
]
n = 469
all = n1.find_all_sums(n, lst)
print(f'find_all_sums({n}, lst) -> {all}')
```
Remove comments `#6`, and `#61` from `run_choices` and run the program:
Output:
```
find_all_sums(469, lst) -> [[179, 290], [101, 167, 201]]
```
Answer questions:
- With a rising length of the input list, how does the number of possible
solutions that must be tested rise?
(2 Pts)
&nbsp;
### 7.) Challenge: Hard Problem of Finding Numbers
Larger data sets can no longer be solved *"brute force"* by exploring all possible
2^n combinations.
Find a solution using a recursive approach exploring a decision tree or
with tabulation.
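
A short calculation shows why exploring all 2^n combinations stops being feasible. The rate of one billion checks per second is an assumption chosen only for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600
CHECKS_PER_SECOND = 10 ** 9      # assumed rate, chosen only for illustration

for n in (20, 64):
    combos = 2 ** n              # number of subsets of a list with n elements
    years = combos / CHECKS_PER_SECOND / SECONDS_PER_YEAR
    print(f"n={n}: {combos} combinations, about {years:.3g} years")
```

Under this assumption the 20-number list from the previous challenge is checked in about a millisecond, while the 64-number list would take several hundred years.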
```py
lst = [ # input list
260, 720, 225, 179, 101, 767, 167, 200, 157, 289,
318, 303, 153, 290, 201, 594, 457, 607, 592, 246,
132, 135, 584, 432, 591, 204, 417, 405, 362, 658,
136, 751, 583, 536, 293, 493, 431, 780, 563, 703,
400, 618, 397, 320, 513, 708, 319, 317, 685, 347,
758, 439, 145, 378, 158, 384, 551, 110, 408, 648,
847, 498, 50, 19, # 64 numbers
]
n = 469
all = n1.find_all_sums(n, lst)
for i, s in enumerate(all):
print(f' - {i+1:2}: sum({sum(s)}) -> {s}')
```
Sort output by length of solution (use length as primary and numeric value
of first element as secondary criterion).
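
One possible way to express a two-level sort in Python is a tuple as sort key. The sketch below uses a few solutions from the expected output as sample data and assumes that no solution is empty.

```python
solutions = [[19, 157, 293], [290, 179], [19, 101, 204, 145], [50, 101, 318]]

# key is a tuple: length first, value of the first element second
ordered = sorted(solutions, key=lambda s: (len(s), s[0]))
print(ordered)
```

Tuples compare element by element, so the second criterion only decides between solutions of equal length. A comparison function wrapped with `functools.cmp_to_key` is an alternative.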
Remove comment `#7` if you tackled the challenge and run the program:
Output:
```
1: sum(469) -> [290, 179]
2: sum(469) -> [19, 157, 293]
3: sum(469) -> [19, 246, 204]
4: sum(469) -> [19, 318, 132]
5: sum(469) -> [19, 400, 50]
6: sum(469) -> [50, 101, 318]
7: sum(469) -> [110, 201, 158]
8: sum(469) -> [136, 201, 132]
9: sum(469) -> [145, 167, 157]
10: sum(469) -> [158, 179, 132]
11: sum(469) -> [201, 101, 167]
12: sum(469) -> [19, 101, 204, 145]
13: sum(469) -> [19, 157, 135, 158]
14: sum(469) -> [19, 179, 135, 136]
15: sum(469) -> [19, 204, 136, 110]
16: sum(469) -> [19, 290, 110, 50]
17: sum(469) -> [19, 101, 167, 132, 50]
18: sum(469) -> [19, 132, 158, 110, 50]
```
( +4 Extra Pts)
from functools import cmp_to_key
"""
Assignment_D: recursion
"""
class Recursion:
numbers = [9, 4, 8, 10, 2, 4, 8, 3, 14, 4, 8] #[4, 12, 3, 8, 17, 12, 1, 3, 8, 7]
def sum(self, _numbers) -> int:
"""
Return sum of numbers using recursion.
Follow steps:
1. Return 0 for an empty list of numbers.
2. Split the problem by removing the first number `n1` from the list leaving `r` as remaining list (sub-problem).
3. Invoke `sum(r)` recursively on the remaining list.
4. Combine the result for the sub-problem with the first number `n1`: `return n1 + sum(r)`.
"""
# your code
return 0
def fib(self, _n, memo=None) -> int:
"""
Return value of n-th Fibonacci number.
- input: n=8
- output: 21
"""
# your code
return 0
def fib_gen(self, _n):
"""
Return a generator object that yields two lists, one with n and the
other with corresponding fib(n).
- input: n=16
- output: generator object that produces:
([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16],
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987])
"""
# your code
yield ([], [])
def perm(self, _numbers) -> list:
"""
Return permutation (all possible arrangements) for a given input list.
- input: [1, 2, 3]
- output: [[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]
"""
# your code
return []
def pset(self, _numbers) -> list:
"""
Return powerset (set of all subsets) for a given input list.
- input: [1, 2, 3]
- output: powerset, [[], [1], [2], [1, 2], [3], [1, 3], [2, 3], [1, 2, 3]]
"""
# your code
return []
def find(self, _numbers, match_func) -> list:
"""
Return list of elements n for which match_func(n) evaluates True.
"""
# your code
return []
def find_adjacent(self, pair, _numbers, _i=0) -> list:
"""
Return list of indexes of adjacent numbers in _numbers.
"""
# your code
return []
def find_pairs(self, n, _numbers) -> list:
"""
Return list of pairs from _numbers that add to n,
any pair, any order, no duplicates.
"""
# your code
return []
def find_all_sums(self, n, _numbers) -> list:
"""
Return all combinations of numbers in _numbers that add to n,
(any pair, any order, no duplicates).
"""
# your code
return []
def __init__(self, _numbers=numbers):
"""
Constructor to initialize member variables.
"""
self.numbers = _numbers
run_choices = [
1, # Challenge 1, Simple recursion: sum numbers
2, # Challenge 2, Fibonacci numbers
# 21, # Challenge 2.1, fib_gen()
# 22, # Challenge 2.2, memoization, fib(60), fib(90)
# 3, # Challenge 3, Permutation
# 4, # Challenge 4, Powerset
# 5, # Challenge 5, Finding matches, find()
# 51, # Challenge 5.1, find_adjacent() pairs
# 52, # Challenge 5.2, find_pairs() that add to n
# 6, # Challenge 6, Find all combinations that add to n
# 61, # Challenge 6.1, Find all in medium set
# 7 # Challenge 7, Hard problem finding numbers (extra points)
]
# Ignore this code that loads solution from file, if exists.
# The solution is not distributed.
try:
_from, _import = 'recursion_sol', 'Recursion'
Recursion = getattr(__import__(_from, fromlist=[_import]), _import)
#
except ImportError:
pass
if __name__ == '__main__':
"""
Main driver that runs when this file is executed by Python interpreter.
"""
run_choices = Recursion.run_choices
numbers = [9, 4, 8, 10, 2, 4, 8, 3, 14, 4, 8]
n1 = Recursion(numbers)
print(f'n1.numbers: {n1.numbers}')
# Challenge 1, Simple recursion: sum numbers
if 1 in run_choices:
s = n1.sum(n1.numbers)
print(f'sum(n1.numbers): {s}')
# Challenge 2, Fibonacci numbers
if 2 in run_choices:
n = 30
print(f'\nfib({n}): {n1.fib(n)}')
# Challenge 2.1, fib_gen()
if 21 in run_choices:
gen = n1.fib_gen(20) # yield generator object
n, fib = next(gen) # trigger generator
print(f'n: {n}')
print(f'fib(n): {fib}')
# Challenge 2.2, memoization, fib(60), fib(90)
if 22 in run_choices:
n = 60
print(f'fib({n}): {n1.fib(n)}') # ??
n = 90
print(f'fib({n}): {n1.fib(n)}') # ??
# Challenge 3, Permutation
if 3 in run_choices:
lst = [1, 2, 3]
perm = n1.perm(lst)
print(f'\nperm({lst}) -> {perm}')
# Challenge 4, Powerset
if 4 in run_choices:
lst = [1, 2, 3]
pset = n1.pset(lst)
print(f'\npset({lst}) -> {pset}')
lst = n1.numbers
#
# Challenge 5, Finding matches, find()
if 5 in run_choices:
div3 = n1.find(lst, match_func=lambda n : n % 3 == 0)
print(f'\nfind numbers divisible by 3: {div3}')
# Challenge 5.1, find_adjacent() pairs
if 51 in run_choices:
pair = [4, 8]
adj = n1.find_adjacent(pair, lst)
print(f'find_adjacent({pair}, list): {adj}')
# Challenge 5.2, find_pairs() that add to n
if 52 in run_choices:
n = 12
pairs = n1.find_pairs(n, lst)
print(f'find_pairs({n}, list) -> {pairs}')
lst = [8, 10, 2, 14, 4] # input list
#
# Challenge 6, Find all combinations that add to n
if 6 in run_choices:
print(f'\nlist: {lst}\n\\\\')
n = 14
all = n1.find_all_sums(n, lst)
print(f'find_all_sums({n}, lst) -> {all}')
#
n = 20
all = n1.find_all_sums(n, lst)
print(f' - find_all_sums({n}, lst) -> {all}')
#
n = 32
all = n1.find_all_sums(n, lst)
print(f' - find_all_sums({n}, lst) -> {all}')
lst = [ # input list
260, 720, 225, 179, 101, 767, 167, 200, 157, 289,
318, 303, 153, 290, 201, 594, 457, 607, 592, 246,
]
#
# Challenge 6.1, Find all in medium set
if 61 in run_choices:
print(f'\nlist({len(lst)}): {lst}\n\\\\')
n = 101 + 201 + 167 # 469 -> [[179, 290], [101, 167, 201]]
all = n1.find_all_sums(n, lst)
for i, s in enumerate(all):
print(f' {i+1:2}: find_all_sums({sum(s)}) -> {s}')
lst = [ # input list
260, 720, 225, 179, 101, 767, 167, 200, 157, 289,
318, 303, 153, 290, 201, 594, 457, 607, 592, 246,
132, 135, 584, 432, 591, 204, 417, 405, 362, 658,
136, 751, 583, 536, 293, 493, 431, 780, 563, 703,
400, 618, 397, 320, 513, 708, 319, 317, 685, 347,
758, 439, 145, 378, 158, 384, 551, 110, 408, 648,
847, 498, 50, 19, # 64 numbers
]
# Challenge 7, Hard problem finding numbers (extra points)
if 7 in run_choices:
print(f'\nlist({len(lst)}) with {len(lst)} numbers.\n\\\\')
n = 101 + 201 + 167 # 469
all = n1.find_all_sums(n, lst)
#
sort_cpm = lambda x, y: -1 if len(x) < len(y) else 1 if len(x) > len(y) else \
-1 if x <= y else 1 if x > y else 0
all.sort(key=cmp_to_key(sort_cpm)) # sort by len(solution)
#
for i, s in enumerate(all):
print(f' {i+1:2}: find_all_sums({sum(s)}) -> {s}')
print()
# #
# n = 899 # 720 + 179, [[720, 179], [260, 179, 157, 303], [167, 289, 153, 290], [289, 153, 457]]
# n = 6240
{
"python.testing.unittestArgs": [
"-v",
"-s",
".",
"-p",
"test_*.py"
],
"python.testing.pytestEnabled": false,
"python.testing.unittestEnabled": true
}
# Assignment E: Data Streams in Python &nbsp; (12 Pts)
### Challenges
- [Challenge 1:](#1-challenge-data-streams-in-python) Data Streams in Python
- [Challenge 2:](#2-challenge-map-function) *map()* function
- [Challenge 3:](#3-challenge-reduce-function) *reduce()* function
- [Challenge 4:](#4-challenge-sort-function) *sort()* function
- [Challenge 5:](#5-challenge-pipeline-for-product-codes) Pipeline for Product Codes
- [Challenge 6:](#6-challenge-run-unit-tests) Run Unit Tests
- [Challenge 7:](#7-challenge-sign-off) Sign-off
Points: [1, 2, 3, 3, 2, 0, 1]
&nbsp;
### 1.) Challenge: Data Streams in Python
Data streams are powerful abstractions for data-driven applications that also work in distributed environments. Big Data platforms often build on streams such as
[Spark Streams](https://spark.apache.org/docs/latest/streaming-programming-guide.html) or
[Kafka](https://kafka.apache.org/documentation/streams).
A data stream starts with a *source* (here just a list of names) followed by a pipeline of *chainable operations* performed on each data element passing through the stream. Results can be collected at the *terminus* of the stream.
Pull Python file [stream.py](stream.py).
```py
class Stream:
    """
    Class of a data stream comprised of a sequence of stream operations:
    """
    class __Stream_op:
        """
        Inner class of one stream operation with chainable functions.
        Instances comprise the stream pipeline.
        """
        def slice(self, i1, i2=None, i3=1):
            # function that returns new __Stream_op instance that slices stream
            if i2 is None:
                i2, i1 = i1, 0
            #
            return self.__new(self.__data[i1:i2:i3])

        def filter(self, filter_func=lambda d : True): ...
            # return new __Stream_op instance that passes only elements for
            # which filter_func yields True

        def map(self, map_func=lambda d : d): ...
            # return new __Stream_op instance that passes elements resulting
            # from map_func of corresponding elements in the inbound stream

        def reduce(self, reduce_func, start=0) -> any: ...
            # terminal function that returns single value compounded by reduce_func

        def sort(self, comperator_func=lambda d1, d2 : True): ...
            # return new __Stream_op instance that passes stream sorted by
            # comperator_func

        def cond(self, cond: bool, conditional): ...
            # return same __Stream_op instance or apply conditional function
            # on __Stream_op instance if condition yields True

        def print(self): ...
            # return same, unchanged __Stream_op instance and print as side effect

        def count(self) -> int: ...
            # terminal function that returns number of elements in terminal stream

        def get(self) -> any: ...
            # terminal function that returns final stream __data
```
The application of the stream can be demonstrated by the example of a stream of names. The stream is instantiated from the `names` list. The `source()` method returns the first `__Stream_op` instance, onto which chainable stream methods can be attached.
The stream in the example filters names of length 4, prints those names and counts them. The *lambda* expression controls the filter process. Only names of length 4 pass to subsequent pipeline operations.
```py
names = ['Gonzalez', 'Gill', 'Hardin', 'Richardson', 'Buckner', 'Marquez',
'Howe', 'Ray', 'Navarro', 'Talley', 'Bernard', 'Gomez', 'Hamilton',
'Case', 'Petty', 'Lott', 'Casey', 'Hall', 'Pena', 'Witt', 'Joyner',
'Raymond', 'Crane', 'Hendricks', 'Vance', 'Cleveland', 'Duncan', 'Soto',
'Brock', 'Graham', 'Nielsen', 'Rutledge', 'Strong', 'Cox']
result = Stream(names).source() \
.filter(lambda n : len(n) == 4) \
.print() \
.count()
print(f'found {result} names with 4 letters.')
```
Output:
```c++
['Gill', 'Howe', 'Case', 'Lott', 'Hall', 'Pena', 'Witt', 'Soto']
found 8 names with 4 letters.
```
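The chaining mechanism can be illustrated in isolation with a minimal sketch. The class `Pipe` below is hypothetical and much simpler than the `Stream` class in `stream.py`; it only shows that a method is chainable when it returns an object that offers the next method.

```python
class Pipe:
    """Hypothetical mini-pipeline, not the Stream class from stream.py."""

    def __init__(self, data):
        self.data = list(data)

    def filter(self, func):
        # return a NEW Pipe, so further methods can be attached with '.'
        return Pipe(d for d in self.data if func(d))

    def print(self):
        # side effect only: return the same instance to keep the chain alive
        print(self.data)
        return self

    def count(self) -> int:
        # terminal method: returns a plain value and ends the chain
        return len(self.data)


result = Pipe(['Gill', 'Hardin', 'Howe', 'Ray']) \
    .filter(lambda n: len(n) == 4) \
    .print() \
    .count()
print(result)
```

Each call in the chain operates on the object returned by the previous call, so the sequence of returned objects forms the pipeline.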
**Questions:**
- How does method chaining work?
- What is required for chainable methods?
- How does a data pipeline get formed in the example?
- Draw a sketch of data objects and how they are linked from the example above.
(1 Pts)
&nbsp;
### 2.) Challenge: *map()* function
Complete the `map()` function in [stream.py](stream.py) so that the example produces
the desired result: Names are mapped to name lengths for the first 8 names.
Name lengths are then compounded to a single result.
```py
result = Stream(names).source() \
.slice(8) \
.print() \
.map(lambda n : len(n)) \
.print()
```
Output:
```c++
['Gonzalez', 'Gill', 'Hardin', 'Richardson', 'Buckner', 'Marquez', 'Howe', 'Ray']
[8, 4, 6, 10, 7, 7, 4, 3]
```
(2 Pts)
&nbsp;
### 3.) Challenge: *reduce()* function
Complete the `reduce()` function in [stream.py](stream.py) so that name lengths are
compounded (added one after another) to a single result.
```py
result = Stream(names).source() \
.slice(8) \
.print() \
.map(lambda n : len(n)) \
.print() \
.reduce(lambda x, y : x + y)
#
print(f'compound number of letters in names is: {result}.')
```
Output:
```c++
['Gonzalez', 'Gill', 'Hardin', 'Richardson', 'Buckner', 'Marquez', 'Howe', 'Ray']
[8, 4, 6, 10, 7, 7, 4, 3]
compound number of letters in names is: 49.
```
(2 Pts)
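For comparison, compounding a list to a single value can be sketched outside the stream with `functools.reduce` from the standard library or with an explicit loop. This is only an illustration of the idea, not the required implementation of `reduce()` in `stream.py`.

```python
from functools import reduce

lengths = [8, 4, 6, 10, 7, 7, 4, 3]

# fold the list from the left, starting with the value 0
total = reduce(lambda x, y: x + y, lengths, 0)

# the same fold written as an explicit loop
compound = 0
for d in lengths:
    compound = compound + d

print(total, compound)
```

The start value determines the type of the result: `0` suits numbers, while an empty string `''` suits string concatenation.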
3.1) Test your implementation to also work for the next example that produces
a single string of all n-letter names:
```py
n = 5
result = Stream(names).source() \
.filter(lambda name : len(name) == n) \
.print() \
.map(lambda n : n.upper()) \
.reduce(lambda x, y : str(x) + str(y), '')
#
print(f'compounded {n}-letter names: {result}.')
```
Output for n=3 and n=5:
```c++
['Ray', 'Cox']
compounded 3-letter names: RAYCOX.
['Gomez', 'Petty', 'Casey', 'Crane', 'Vance', 'Brock']
compounded 5-letter names: GOMEZPETTYCASEYCRANEVANCEBROCK.
```
(1 Pts)
&nbsp;
### 4.) Challenge: *sort()* function
Complete the `sort()` function in [stream.py](stream.py) so that the example produces
the desired result (use Python's built-in `sort()` or `sorted()` functions).
```py
Stream(names).source() \
.slice(8) \
.print('unsorted: ') \
.sort() \
.print(' sorted: ')
```
Output:
```c++
unsorted: ['Gonzalez', 'Gill', 'Hardin', 'Richardson', 'Buckner', 'Marquez', 'Howe', 'Ray']
sorted: ['Buckner', 'Gill', 'Gonzalez', 'Hardin', 'Howe', 'Marquez', 'Ray', 'Richardson']
```
(1 Pts)
4.1) Understand the sorted sequence below and define a `comperator`: an expression that compares two elements `n1` and `n2` and yields `-1` if n1 should come before n2, `+1` if n1 must come after n2, or `0` if n1 is equal to n2:
```py
len_alpha_comperator = lambda ...
Stream(names).source() \
.sort(len_alpha_comperator) \
.print('sorted: ')
```
Output:
```c++
sorted: ['Cox', 'Ray', 'Case', 'Gill', 'Hall', 'Howe', 'Lott', 'Pena', 'Soto', 'Witt', 'Brock', 'Casey', 'Crane', 'Gomez', 'Petty', 'Vance', 'Duncan', 'Graham', 'Hardin', 'Joyner', 'Strong', 'Talley', 'Bernard', 'Buckner', 'Marquez', 'Navarro', 'Nielsen', 'Raymond', 'Gonzalez', 'Hamilton', 'Rutledge', 'Cleveland', 'Hendricks', 'Richardson']
```
(1 Pts)
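A comparator of this kind can be plugged into Python's built-in `sorted()` through `functools.cmp_to_key`. The sketch below uses a shortened list of names, and the function name `by_length_then_alpha` is made up for illustration.

```python
from functools import cmp_to_key

def by_length_then_alpha(n1, n2) -> int:
    """Hypothetical comparator: shorter names first, ties alphabetically."""
    if len(n1) != len(n2):
        return -1 if len(n1) < len(n2) else 1
    return -1 if n1 < n2 else 1 if n1 > n2 else 0

names = ['Gonzalez', 'Gill', 'Hardin', 'Ray', 'Howe', 'Cox']

# cmp_to_key wraps the two-argument comparator into a key function
print(sorted(names, key=cmp_to_key(by_length_then_alpha)))
```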
4.2) Extend the pipeline so that it produces the following output:
```c++
sorted: [('Cox', 'Xoc', 3), ('Ray', 'Yar', 3), ('Brock', 'Kcorb', 5), ('Casey', 'Yesac', 5), ('Crane', 'Enarc', 5), ('Gomez', 'Zemog', 5), ('Petty', 'Yttep', 5), ('Vance', 'Ecnav', 5), ('Bernard', 'Dranreb', 7), ('Buckner', 'Renkcub', 7), ('Marquez', 'Zeuqram', 7), ('Navarro', 'Orravan', 7), ('Nielsen', 'Neslein', 7), ('Raymond', 'Dnomyar', 7), ('Cleveland', 'Dnalevelc', 9), ('Hendricks', 'Skcirdneh', 9)]
\\
16 odd-length names found.
```
(1 Pts)
&nbsp;
### 5.) Challenge: Pipeline for Product Codes
Build a pipeline that produces batches of five 6-digit numbers with prefix 'X'.
Numbers are in ascending order within each batch and end with a 1-digit checksum
after a dash. The checksum is the sum of all six digits of the random number modulo 10.
```py
for i in range(1, 5):
    # Stream of 5 random numbers from integer range, feel free to change
    codes = Stream([random.randint(100000,999999) for j in range(5)]).source() \
        ... \
        .get()
    #
    print(f'batch {i}: {codes}')
```
Output:
```c++
batch 1: ['X102042-9', 'X102180-2', 'X103228-6', 'X104680-9', 'X106782-4']
batch 2: ['X200064-2', 'X200732-4', 'X202090-3', 'X209056-2', 'X211464-8']
batch 3: ['X300186-8', 'X301416-5', 'X305962-5', 'X307938-0', 'X312524-7']
batch 4: ['X400216-3', 'X401436-8', 'X401682-1', 'X405256-2', 'X406376-6']
```
(1 Pts)
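The checksum rule can be checked on its own with a small helper. The function name `product_code` is hypothetical, and the sketch formats a single number only; building the sorted batches with the stream remains part of the challenge.

```python
def product_code(number: int) -> str:
    """Format a number as 'X<number>-<checksum>' (hypothetical helper)."""
    # checksum: sum of all digits of the number, modulo 10
    checksum = sum(int(digit) for digit in str(number)) % 10
    return f'X{number}-{checksum}'

print(product_code(102042))
print(product_code(102180))
```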
5.1) Alter the pipeline such that it produces only codes that consist of even digits:
```c++
batch 1: ['X226840-2', 'X284240-0', 'X448288-4', 'X804080-0', 'X888620-2']
batch 2: ['X220640-4', 'X248066-6', 'X648466-4', 'X680404-2', 'X882868-0']
batch 3: ['X262626-4', 'X608662-8', 'X626404-2', 'X662424-4', 'X846228-0']
batch 4: ['X224200-0', 'X282204-8', 'X448426-8', 'X600282-8', 'X802882-8']
```
(1 Pts)
&nbsp;
### 6.) Challenge: Run Unit Tests
Pull file
[test_stream.py](test_stream.py)
into the same directory. Run the unit tests to confirm the correctness of your solution.
```sh
cd E_streams # change to directory where stream.py and test_stream.py are
test_url=https://gitlab.bht-berlin.de/sgraupner/ds_cs4bd_2324/-/raw/main/E_streams/test_stream.py
curl -O $(echo $test_url) # download file with unit tests from URL
python test_stream.py # run tests from test file
python -m unittest --verbose # run unit tests with discovery
```
Output:
```sh
Ran 12 tests in 0.001s
OK
Unit testing using test objects:
- test_filter_1()
- test_filter_11()
- test_filter_12()
- test_filter_13()
- test_map_2()
- test_map_21()
- test_reduce_3()
- test_reduce_31()
- test_sort_4()
- test_sort_41()
- test_sort_42()
- test_stream_generation()
---> 12/12 TESTS SUCCEEDED
```
&nbsp;
### 7.) Challenge: Sign-off
For sign-off, change into `E_streams` directory and copy commands into a terminal:
```sh
test_url=https://gitlab.bht-berlin.de/sgraupner/ds_cs4bd_2324/-/raw/main/E_streams/test_stream.py
curl $test_url | python # run tests from URL (use for sign-off)
```
Result:
```
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 6264 100 6264 0 0 38354 0 --:--:-- --:--:-- --:--:-- 38666
Unit testing using test objects:
<stdin>:153: DeprecationWarning: unittest.makeSuite() is deprecated and will be
removed in Python 3.13. Please use unittest.TestLoader.loadTestsFromTestCase() i
nstead.
----------------------------------------------------------------------
Ran 12 tests in 0.001s
OK
```
12 tests succeeded.
(1 Pts)
"""
Special file __init__.py marks a directory as a Python module.
The file is executed once when any .py file is imported.
//
Python unittest requires the presence of this file (even if it is empty).
"""
# load setup module when executed in parent directory
try:
__import__('setup')
#
except ImportError:
pass
import random
"""
Class of a data stream comprised of a sequence of stream operations:
- slice(i1, i2, i3) # slice stream in analogy to python slicing
- filter(filter_func) # pass only elements for which filter_func yields True
- map(map_func) # pass stream where each element is mapped by map_func
- sort(comperator_func) # pass stream sorted by comperator_func
- cond(cond, cond_func) # pass stream or apply conditional function
- print() # pass unchanged stream and print as side effect
and with terminal functions:
- reduce(reduce_func, start) # compound stream to single value with reduce_func
- count() # return number of elements in terminal stream
- get() # return final stream data
"""
class Stream:
def __init__(self, _data=[]):
# constructor to initialize instance member variables
#
self.__streamSource = self.__new_op(_data)
class __Stream_op:
"""
Inner class of one stream operation with chainable functions.
Instances comprise the stream pipeline.
"""
def __init__(self, _new_op_func, _data):
self.__data = _data
self.__new = _new_op_func # __new_op() function injected from outer context
def slice(self, i1, i2=None, i3=1):
# function that returns new __Stream_op instance that slices stream
if i2 is None:
# flip i1, i2 for single arg, e.g. slice(0, 8), slice(8)
i2, i1 = i1, 0
#
# return new __Stream_op instance with sliced __data
return self.__new(self.__data[i1:i2:i3])
def filter(self, filter_func=lambda d : True):
# return new __Stream_op instance that passes only elements for
# which filter_func yields True
#
return self.__new([d for d in self.__data if filter_func(d)])
def map(self, map_func=lambda d : d):
# return new __Stream_op instance that passes elements resulting
# from map_func of corresponding elements in the inbound stream
#
# input data is list of current instance: self.__data
# mapping means a new list needs to be created with same number of
# elements, each obtained by applying map_func
# create new data for next __Stream_op instance from current instance
# data: self.__data
new_data = self.__data # <-- compute new data here
# create new __Stream_op instance with new stream data
new_stream_op_instance = self.__new(new_data)
return new_stream_op_instance
def reduce(self, reduce_func=lambda compound, d : compound + d, start=0) -> any:
# terminal function that returns single value compounded by reduce_func
#
compound = 0 # <-- compute compound result here
return compound
def sort(self, comperator_func=lambda n1, n2 : -1 if n1 < n2 else 1):
# return new __Stream_op instance that passes stream sorted by
# comperator_func
#
# create new data for next __Stream_op instance from current instance
# data: self.__data
new_data = self.__data # <-- compute new data here
# create new __Stream_op instance with new stream data
new_stream_op_instance = self.__new(new_data)
return new_stream_op_instance
def cond(self, cond: bool, conditional):
# return same __Stream_op instance or apply conditional function
# on __Stream_op instance if condition yields True
#
return conditional(self) if cond else self
def print(self, prefix=''):
# return same, unchanged __Stream_op instance and print as side effect
#
print(f'{prefix}{self.__data}')
return self
def count(self) -> int:
# terminal function that returns number of elements in terminal stream
#
return len(self.__data)
def get(self) -> any:
# terminal function that returns final stream __data
#
return self.__data
def source(self):
# return first __Stream_op instance of stream as source
#
return self.__streamSource
def __new_op(self, *argv):
# private method to create new __Stream_op instance
return Stream.__Stream_op(self.__new_op, *argv)
# attempt to load solution module (ignore)
try:
_from, _import = 'stream_sol', 'Stream'
# fetch Stream class from solution, if present
Stream = getattr(__import__(_from, fromlist=[_import]), _import)
#
except ImportError:
pass
if __name__ == '__main__':
run_choice = 3
#
run_choices = {
1: "Challenge 1, Data streams in Python, run the first example",
2: "Challenge 2, complete map() function",
3: "Challenge 3, complete reduce() function",
31: "Challenge 3.1, example RAYCOX",
4: "Challenge 4, complete sort() function",
41: "Challenge 4.1, len-alpha comperator",
42: "Challenge 4.2, tuple output: ('Cox', 'Xoc', 3)",
5: "Challenge 5, Pipeline for product codes",
51: "Challenge 5.1, even digit codes"
}
names = ['Gonzalez', 'Gill', 'Hardin', 'Richardson', 'Buckner', 'Marquez',
'Howe', 'Ray', 'Navarro', 'Talley', 'Bernard', 'Gomez', 'Hamilton',
'Case', 'Petty', 'Lott', 'Casey', 'Hall', 'Pena', 'Witt', 'Joyner',
'Raymond', 'Crane', 'Hendricks', 'Vance', 'Cleveland', 'Duncan', 'Soto',
'Brock', 'Graham', 'Nielsen', 'Rutledge', 'Strong', 'Cox']
if run_choice == 1:
# Challenge 1, Data streams in Python, run the first example
result = Stream(names).source() \
.filter(lambda n : len(n) == 4) \
.print() \
.count()
#
print(f'found {result} names with 4 letters.')
if run_choice == 2:
# Challenge 2, complete map() function
# to map names to name lengths for the first 8 names
Stream(names).source() \
.slice(8) \
.print() \
.map(lambda n : len(n)) \
.print()
if run_choice == 3:
# Challenge 3, complete reduce() function
# to compound all name lengths to a single result
result = Stream(names).source() \
.slice(8) \
.print() \
.map(lambda n : len(n)) \
.print() \
.reduce(lambda x, y : x + y)
#
print(f'compound number of letters in names is: {result}.')
if run_choice == 31:
# Challenge 3.1, example RAYCOX
# compound single string of all n-letter names
n = 3
result = Stream(names).source() \
.filter(lambda name : len(name) == n) \
.print() \
.map(lambda n : n.upper()) \
.reduce(lambda x, y : str(x) + str(y), '')
#
print(f'compounded {n}-letter names: {result}.')
if run_choice == 4:
# Challenge 4, complete sort() function
Stream(names).source() \
.slice(8) \
.print('unsorted: ') \
.sort() \
.print(' sorted: ')
alpha_comperator = lambda n1, n2 : -1 if n1 < n2 else 1
len_alpha_comperator = lambda n1, n2 : -1 if len(n1) < len(n2) else 1 if len(n1) > len(n2) else alpha_comperator(n1, n2)
#
if run_choice == 41:
# Challenge 4.1, len-alpha comperator
Stream(names).source() \
.sort(len_alpha_comperator) \
.print('sorted: ')
if run_choice == 42:
# Challenge 4.2, tuple output: ('Cox', 'Xoc', 3)
result = Stream(names).source() \
.sort(len_alpha_comperator) \
.map(lambda n : (n, n[::-1].capitalize(), len(n))) \
.filter(lambda n1 : n1[2] % 2 == 1) \
.print('sorted: ') \
.count()
#
print(f'\\\\\n{result} odd-length names found.')
# rand_numbers = [random.randint(100000,999999) for i in range(30)]
# print(f'random numbers: {rand_numbers}')
#
if run_choice == 5 or run_choice == 51:
# Challenge 5, Pipeline for product codes
# Challenge 5.1, even digit codes
#
for i in range(1, 5):
# Stream of 5 random numbers from integer range, feel free to change
codes = Stream([random.randint(100000,999999) for j in range(1000)]).source() \
.filter(lambda n : n % 2 == 0) \
.cond( run_choice == 51, \
# use only numbers with even digits, test by split up number in sequence of digits
lambda op : op.filter(lambda n : len(set(map(int, str(n))).intersection([1, 3, 5, 7, 9])) == 0) \
) \
.slice(5) \
.sort() \
.map(lambda n : f'X{n}-{sum(list(map(int, str(n)))) % 10}') \
.get()
#
print(f'batch {i}: {codes}')
import unittest
from stream import Stream
class Stream_test(unittest.TestCase):
"""
Test class.
"""
list_1 = [4, 12, 3, 8, 17, 12, 1, 8, 7]
list_1_str = [str(d) for d in list_1]
names = ['Gonzalez', 'Gill', 'Hardin', 'Richardson', 'Buckner', 'Marquez',
'Howe', 'Ray', 'Navarro', 'Talley', 'Bernard', 'Gomez', 'Hamilton',
'Case', 'Petty', 'Lott', 'Casey', 'Hall', 'Pena', 'Witt', 'Joyner',
'Raymond', 'Crane', 'Hendricks', 'Vance', 'Cleveland', 'Duncan', 'Soto',
'Brock', 'Graham', 'Nielsen', 'Rutledge', 'Strong', 'Cox']
# tests for stream generation function
def test_stream_generation(self):
#
result = Stream(self.list_1).source() \
.get()
self.assertEqual(self.list_1, result)
# tests for filter() function
def test_filter_1(self):
#
# test Challenge 1
result = Stream(self.list_1).source() \
.filter(lambda n : n % 2 == 1) \
.get()
self.assertEqual([3, 17, 1, 7], result)
def test_filter_11(self):
result = Stream(self.list_1).source() \
.filter(lambda d : False) \
.get()
self.assertEqual([], result)
def test_filter_12(self):
result = Stream(self.list_1).source() \
.filter(lambda d : True) \
.get()
self.assertEqual(self.list_1, result)
def test_filter_13(self):
result = Stream(self.names).source() \
.filter(lambda n : len(n) == 4) \
.get()
self.assertEqual(['Gill', 'Howe', 'Case', 'Lott', 'Hall', 'Pena', 'Witt', 'Soto'], result)
# tests for map() function
def test_map_2(self):
#
# test Challenge 2
result = Stream(self.names).source() \
.slice(8) \
.map(lambda n : len(n)) \
.get()
self.assertEqual([8, 4, 6, 10, 7, 7, 4, 3], result)
def test_map_21(self):
result = Stream(self.names).source() \
.filter(lambda n : len(n) == 3) \
.map(lambda n : (n, len(n))) \
.get()
self.assertEqual([('Ray', 3), ('Cox', 3)], result)
# tests for reduce() function
def test_reduce_3(self):
#
# test Challenge 3
result = Stream(self.names).source() \
.slice(8) \
.map(lambda n : len(n)) \
.reduce(lambda x, y : x + y)
self.assertEqual(49, result)
def test_reduce_31(self):
# test Challenge 3.1
n = 3
result = Stream(self.names).source() \
.filter(lambda name : len(name) == n) \
.map(lambda n : n.upper()) \
.reduce(lambda x, y : str(x) + str(y), '')
self.assertEqual('RAYCOX', result)
#
n = 5
result = Stream(self.names).source() \
.filter(lambda name : len(name) == n) \
.map(lambda n : n.upper()) \
.reduce(lambda x, y : str(x) + str(y), '')
self.assertEqual('GOMEZPETTYCASEYCRANEVANCEBROCK', result)
# tests for sort() function
def test_sort_4(self):
# test Challenge 4
result = Stream(self.names).source() \
.slice(8) \
.sort() \
.get()
expected = ['Buckner', 'Gill', 'Gonzalez', 'Hardin', 'Howe', 'Marquez', 'Ray', 'Richardson']
self.assertEqual(expected, result)
def alpha_comperator(self, n1, n2):
return -1 if n1 < n2 else 1
def len_alpha_comperator(self, n1, n2):
return -1 if len(n1) < len(n2) else 1 if len(n1) > len(n2) else self.alpha_comperator(n1, n2)
def test_sort_41(self):
# test Challenge 4.1
result = Stream(self.names).source() \
.sort(self.len_alpha_comperator) \
.get()
#
expected = ['Cox', 'Ray', 'Case', 'Gill', 'Hall', 'Howe', 'Lott', 'Pena', 'Soto', 'Witt',
'Brock', 'Casey', 'Crane', 'Gomez', 'Petty', 'Vance', 'Duncan', 'Graham', 'Hardin',
'Joyner', 'Strong', 'Talley', 'Bernard', 'Buckner', 'Marquez', 'Navarro', 'Nielsen',
'Raymond', 'Gonzalez', 'Hamilton', 'Rutledge', 'Cleveland', 'Hendricks', 'Richardson'
]
self.assertEqual(expected, result)
def test_sort_42(self):
# test Challenge 4.2
result = Stream(self.names).source() \
.sort(self.len_alpha_comperator) \
.map(lambda n : (n, n[::-1].capitalize(), len(n))) \
.filter(lambda n1 : n1[2] % 2 == 1) \
.get()
#
expected = [('Cox', 'Xoc', 3), ('Ray', 'Yar', 3), ('Brock', 'Kcorb', 5), ('Casey', 'Yesac', 5),
('Crane', 'Enarc', 5), ('Gomez', 'Zemog', 5), ('Petty', 'Yttep', 5), ('Vance', 'Ecnav', 5),
('Bernard', 'Dranreb', 7), ('Buckner', 'Renkcub', 7), ('Marquez', 'Zeuqram', 7),
('Navarro', 'Orravan', 7), ('Nielsen', 'Neslein', 7), ('Raymond', 'Dnomyar', 7),
('Cleveland', 'Dnalevelc', 9), ('Hendricks', 'Skcirdneh', 9)
]
self.assertEqual(expected, result)
#
result = Stream(self.names).source() \
.sort(self.len_alpha_comperator) \
.map(lambda n : (n, n[::-1].capitalize(), len(n))) \
.filter(lambda n1 : n1[2] % 2 == 1) \
.count()
self.assertEqual(16, result)
# report results,
# see https://stackoverflow.com/questions/28500267/python-unittest-count-tests
# currentResult = None
# @classmethod
# def setResult(cls, amount, errors, failures, skipped):
# cls.amount, cls.errors, cls.failures, cls.skipped = \
# amount, errors, failures, skipped
# def tearDown(self):
# amount = self.currentResult.testsRun
# errors = self.currentResult.errors
# failures = self.currentResult.failures
# skipped = self.currentResult.skipped
# self.setResult(amount, errors, failures, skipped)
# @classmethod
# def tearDownClass(cls):
# print("\ntests run: " + str(cls.amount))
# print("errors: " + str(len(cls.errors)))
# print("failures: " + str(len(cls.failures)))
# print("success: " + str(cls.amount - len(cls.errors) - len(cls.failures)))
# print("skipped: " + str(len(cls.skipped)))
# def run(self, result=None):
# self.currentResult = result # remember result for use in tearDown
# unittest.TestCase.run(self, result) # call superclass run method
if __name__ == '__main__':
result = unittest.main()
# Assignment F: Graph Data &nbsp; (10 Pts)
### Challenges
- [Challenge 1:](#1-challenge-understanding-graph-data) Understanding Graph Data
- [Challenge 2:](#2-challenge-representing-graph-data-in-python) Representing Graph Data in Python
- [Challenge 3:](#3-challenge-implementing-the-graph-in-python) Implementing the Graph in Python
- [Challenge 4:](#4-challenge-implementing-dijkstras-shortest-path-algorithm) Implementing Dijkstra's Shortest Path Algorithm
- [Challenge 5:](#5-challenge-run-for-another-graph) Run for Another Graph
Points: [1, 1, 2, 4, 2]
&nbsp;
### 1.) Challenge: Understanding Graph Data
A *[Graph](https://en.wikipedia.org/wiki/Graph_theory)*
is a set of nodes (vertices) and edges connecting nodes G = { n ∈ N, e ∈ E }.
A *weighted Graph* has a *weight* (number) associated with each edge.
A *[Path](https://en.wikipedia.org/wiki/Path_(graph_theory))*
is a subset of edges that connects a subset of nodes.
We consider connected graphs, where all nodes can be reached from any other
node by at least one path (no disconnected subgraphs).
Graphs may have cycles (paths that lead to nodes visited before) or
paths may join at nodes that are part of other paths.
Traversal is the process of visiting each node of a graph exactly once.
Multiple visits of graph nodes by cycles or joins must be detected by
marking visited nodes (which is not preferred since it alters the data set)
or by keeping a separate record of visits.
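Keeping a separate record of visits can be sketched as follows. The small graph is hypothetical (a cycle A, B, C and a join at D), and breadth-first order is only one of several possible traversal orders.

```python
from collections import deque

# hypothetical small graph with a cycle A - B - C - A and a join at D
graph = {
    'A': ['B', 'C'],
    'B': ['A', 'C', 'D'],
    'C': ['A', 'B', 'D'],
    'D': ['B', 'C'],
}

def traverse(graph, start) -> list:
    """Breadth-first traversal that visits each node exactly once."""
    visited = {start}           # separate record of visits
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:     # detects cycles and joins
                visited.add(neighbor)
                queue.append(neighbor)
    return order

print(traverse(graph, 'A'))
```

The graph data itself stays unchanged; only the `visited` set grows during the traversal.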
Write two properties that distinguish graphs from trees.
(1 Pt)
&nbsp;
### 2.) Challenge: Representing Graph Data in Python
Python has no built-in data type that supports graph data.
Separate packages may be used such as
[NetworkX](https://networkx.org/).
In this assignment, we focus on basic Python data structures.
1. How can Graphs be represented in general?
1. How can these be implemented using Python base data structures?
1. Which data structure would be efficient given the fact that the graph
in the example below is constant and only traversal operations
are performed?
(1 Pt)
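As one possible illustration, an adjacency dictionary can be built from a list of weighted edges. The sketch uses the edge list given in Challenge 5 of this assignment and assumes that edges are undirected, which the assignment does not state explicitly.

```python
# edges with weights, taken from the graph in Challenge 5
edges = [
    ('A', 'B', 2), ('B', 'E', 6), ('E', 'C', 9), ('A', 'D', 8), ('B', 'D', 5),
    ('D', 'E', 3), ('D', 'F', 2), ('E', 'F', 1), ('F', 'C', 3),
]

# adjacency dictionary: node -> {neighbor: weight}
graph = {}
for n1, n2, weight in edges:
    graph.setdefault(n1, {})[n2] = weight
    graph.setdefault(n2, {})[n1] = weight   # assumption: edges are undirected

print(graph['D'])
```

A dictionary gives direct access to the neighbors of a node, which is the operation that traversal algorithms perform most often.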
&nbsp;
### 3.) Challenge: Implementing the Graph in Python
Watch the video and understand how
[Dijkstra's Shortest Path Algorithm](https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm)
(1956) works and which information it needs.
*Edsger W. Dijkstra* (1930-2003,
[bio](https://en.wikipedia.org/wiki/Edsger_W._Dijkstra))
was a Dutch computer scientist, programmer and software engineer.
He was a professor of Computer Science at the University of Texas at Austin
and has received numerous awards, including the
[Turing Award](https://en.wikipedia.org/wiki/Turing_Award)
in 1972.
<!--
[video (FelixTechTips)](https://youtu.be/bZkzH5x0SKU?si=n8Z2ZIfbB73_v1TE)
<img src="../markup/img/graph_2a.jpg" alt="drawing" width="640"/>
-->
[Video (Mike Pound, Computerphile)](https://youtu.be/GazC3A4OQTE?si=ZuBEcWaBzuKmPMqA)
<img src="../markup/img/graph_1.jpg" alt="drawing" width="640"/>
Node `S` forms the start of the algorithm, node `E` is the destination.
Draw a sketch of the data structures needed to represent the graph with
nodes, edges and weights and also the data needed for the algorithm.
Create a Python file `shortest_path.py` with
- declarations of data structures you may need for the graph and
information for the algorithm and
- data to represent the graph in the video with nodes: {A ... K, S} and
the shown edges with weights.
(2 Pts)
&nbsp;
### 4.) Challenge: Implementing Dijkstra's Shortest Path Algorithm
Implement Dijkstra's Algorithm.
Output the shortest path as a sequence of nodes, followed by an analysis and
the shortest distance.
```
shortest path: S -> B -> H -> G -> E
analysis:
S->B(2)
B->H(1)
H->G(2)
G->E(2)
shortest distance is: 7
```
(4 Pts)
&nbsp;
### 5.) Challenge: Run for Another Graph
Run your algorithm for another graph G: {A ... F} with weights:
```
G: {A, B, C, D, E, F}, start: A, end: C
Weights:
AB(2), BE(6), EC(9), AD(8), BD(5),
DE(3), DF(2), EF(1), FC(3)
```
Output the result:
```
shortest path: A -> B -> D -> F -> C
analysis:
A->B(2)
B->D(5)
D->F(2)
F->C(3)
shortest distance is: 12
```
(2 Pts)
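For orientation, a compact variant of the algorithm using a priority queue (`heapq` from the standard library) is sketched below for the graph of Challenge 5. It is a sketch under assumptions, not the expected solution: edges are taken as undirected, the graph is assumed to be connected, and the function name `shortest_path` is made up.

```python
import heapq

graph = {   # adjacency dictionary, edges assumed to be undirected
    'A': {'B': 2, 'D': 8},
    'B': {'A': 2, 'E': 6, 'D': 5},
    'C': {'E': 9, 'F': 3},
    'D': {'A': 8, 'B': 5, 'E': 3, 'F': 2},
    'E': {'B': 6, 'C': 9, 'D': 3, 'F': 1},
    'F': {'D': 2, 'E': 1, 'C': 3},
}

def shortest_path(graph, start, end):
    """Return (distance, path) from start to end; assumes end is reachable."""
    dist = {start: 0}       # best known distance per node
    prev = {}               # predecessor on the best known path
    done = set()            # nodes whose distance is final
    queue = [(0, start)]
    while queue:
        d, node = heapq.heappop(queue)
        if node in done:
            continue        # stale queue entry
        done.add(node)
        if node == end:
            break
        for neighbor, weight in graph[node].items():
            nd = d + weight
            if neighbor not in dist or nd < dist[neighbor]:
                dist[neighbor] = nd
                prev[neighbor] = node
                heapq.heappush(queue, (nd, neighbor))
    # walk back from end to start along the predecessors
    path = [end]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return dist[end], path[::-1]

distance, path = shortest_path(graph, 'A', 'C')
print(f'shortest path: {" -> ".join(path)}')
print(f'shortest distance is: {distance}')
```

Note that this graph contains more than one path of length 12 (for example via E instead of D), so the path reported can depend on how ties are broken.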