Compare revisions

Changes are shown as if the source revision was being merged into the target revision. Learn more about comparing revisions.

Commits on Source (31)
Showing
with 2584 additions and 222 deletions
......@@ -131,5 +131,7 @@ dmypy.json
# project-specific files
README_init.md
**/*_sol.py
# solution files and .py files in project directory
**/*_sol.py
/*.py
......@@ -3,7 +3,7 @@
"python.testing.unittestArgs": [
"-v",
"-s",
"./C_expressions",
".",
"-p",
"test_*.py"
],
......
# Assignment A: Setup Python &nbsp; (<span style="color:red">10 Pts</span>)
# Assignment A: Setup Python &nbsp; (10 Pts)
This assignment sets up your base Python environment. If you already have one, simply run the challenges and answer the questions (if any). If you cannot run the challenges, set up the required software.
### Challenges
1. [Challenge 1:](#1-challenge-1-terminal) Terminal
2. [Challenge 2:](#2-challenge-2-python3) Python3
3. [Challenge 3:](#3-challenge-3-pip) pip
4. [Challenge 4:](#4-challenge-4-test-python) Test Python
- [Challenge 1:](#1-challenge-terminal) Terminal
- [Challenge 2:](#2-challenge-python3) Python3
- [Challenge 3:](#3-challenge-pip) pip
- [Challenge 4:](#4-challenge-test-python) Test Python
Points: [4, 2, 2, 2]
&nbsp;
### 1.) Challenge 1: Terminal
### 1.) Challenge: Terminal
If you are using *MacOS* or *Linux*, skip steps for *Windows*.
......@@ -56,12 +58,11 @@ Find out about
- When is `.profile` executed?
- When is `.bashrc` (Mac: `.zshrc`) executed?
(4 Pts)
&nbsp;
### 2.) Challenge 2: Python3
### 2.) Challenge: Python3
Check if you have Python 3 installed on your system. Name three differences between [Python 2 and 3](https://www.guru99.com/python-2-vs-python-3.html#7).
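As a starting point for the comparison, the following sketch runs under Python 3 and notes in comments how Python 2 behaved for the same statements. It covers only division, text strings and `print`, and is meant as an illustration rather than a complete answer.

```python
# Python 3 behaviour; Python 2 differences are noted in the comments.

true_division = 7 / 2      # 3.5 in Python 3 (Python 2 returned 3 for two ints)
floor_division = 7 // 2    # 3 in both versions

text = "Grüße"                     # str holds Unicode text in Python 3
encoded = text.encode("utf-8")     # bytes is a separate type

# print is a function in Python 3 (it was a statement in Python 2)
print(true_division, floor_division)
print(type(text).__name__, len(text), type(encoded).__name__, len(encoded))
```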
Run commands in terminal (version 3.x.y may vary):
......@@ -77,7 +78,7 @@ PATH variable in `.bashrc` (Mac: `.zshrc`).
&nbsp;
### 3.) Challenge 3: pip
### 3.) Challenge: pip
Check if you have a Python package manager installed (pip, conda, ... ). [`pip`](https://pip.pypa.io) is Python's default package manager to install additional Python packages and libraries.
Follow [instructions](https://pip.pypa.io/en/stable/installing) for installation:
......@@ -95,7 +96,7 @@ packages\pip (python 3.12)
&nbsp;
### 4.) Challenge 4: Test Python
### 4.) Challenge: Test Python
Start Python in the terminal and execute commands:
```py
> python
......
# Assignment B: Explore Python &nbsp; (<span style="color:red">20 Pts</span>)
# Assignment B: Explore Python &nbsp; (20 Pts)
This assignment demonstrates Python's basic data structures.
### Challenges
1. [Challenge 1:](#1-challenge-1-indexing-fruits) Indexing Fruits
2. [Challenge 2:](#2-challenge-2-packaging-fruits) Packaging Fruits
3. [Challenge 3:](#3-challenge-3-sorting-fruits) Sorting Fruits
4. [Challenge 4:](#4-challenge-4-income-analysis) Income Analysis
5. [Challenge 5:](#5-challenge-5-code-income-analysis) Code Income Analysis
6. [Challenge 6:](#6-challenge-6-explore-python-built-in-functions)
- [Challenge 1:](#1-challenge-indexing-fruits) Indexing Fruits
- [Challenge 2:](#2-challenge-packaging-fruits) Packaging Fruits
- [Challenge 3:](#3-challenge-sorting-fruits) Sorting Fruits
- [Challenge 4:](#4-challenge-income-analysis) Income Analysis
- [Challenge 5:](#5-challenge-code-income-analysis) Code Income Analysis
- [Challenge 6:](#6-challenge-explore-python-built-in-functions)
Explore Python built-in functions
Points: [1, 7, 2, 3, 6, 1]
&nbsp;
### 1.) Challenge 1: Indexing Fruits
### 1.) Challenge: Indexing Fruits
Review Python's basic
[data structures](https://www.dataquest.io/blog/data-structures-in-python).
......@@ -41,11 +44,13 @@ the second-last fruit is: banana
>>> print(f"the last three fruits are: {fruits[-3:]}")
the last three fruits are: ['orange', 'banana', 'apple']
```
Perform examples on your laptop. (1 Pt)
Perform examples on your laptop.
(1 Pt)
&nbsp;
### 2.) Challenge 2: Packaging Fruits
### 2.) Challenge: Packaging Fruits
Review Python's built-in
[data structures](https://www.dataquest.io/blog/data-structures-in-python).
......@@ -63,7 +68,7 @@ Perform examples and answer questions on a piece of paper.
>>> fruitbox = ('apple', 'pear', 'orange', 'banana', 'apple')
>>> print(fruits)
['apple', 'pear', 'orange', 'banana']
['apple', 'pear', 'orange', 'banana', 'apple']
>>> print(fruitbox)
('apple', 'pear', 'orange', 'banana', 'apple')
......@@ -80,15 +85,24 @@ Perform examples and answer questions on a piece of paper.
>>>
```
1. How is the structure for Eric called? (1 Pt)
1. What is the structure for `eric1` called? How does it differ from
`eric2`? Explain the outputs. (1 Pt)
```py
eric = {"name": "Eric", "salary": 5000, "birthday": "Sep 25 2001"}
eric1 = {"name": "Eric", "salary": 5000, "birthday": "Sep 25 2001"}
>>> print(eric)
eric2 = {"name", "Eric", "salary", 5000, "birthday", "Sep 25 2001"}
>>> print(eric1)
{'name': 'Eric', 'salary': 5000, 'birthday': 'Sep 25 2001'}
>>> print(eric["salary"])
>>> print(eric2)
{'Sep 25 2001', 5000, 'name', 'Eric', 'birthday', 'salary'}
#
# print(eric2) in same order?
# print salary for eric1 and eric2?
>>> print(eric1["salary"])
5000
```
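The difference between the two literals can be explored with a short script. This is a minimal sketch that reuses the values from the example above; the `try`/`except` block is added only to show what happens when a lookup by key is attempted on the second structure.

```python
# Values taken from the example above.
eric1 = {"name": "Eric", "salary": 5000, "birthday": "Sep 25 2001"}  # key: value pairs
eric2 = {"name", "Eric", "salary", 5000, "birthday", "Sep 25 2001"}  # bare elements

print(type(eric1).__name__)   # dict
print(type(eric2).__name__)   # set

# A dict maps keys to values, so a lookup by key works.
print(eric1["salary"])        # 5000

# A set stores unique elements only; it has no keys and no guaranteed order.
try:
    eric2["salary"]
except TypeError as err:
    print(f"lookup in set failed: {err}")

# Membership testing is what a set supports instead.
print("salary" in eric2)      # True
```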
......@@ -107,7 +121,7 @@ in Python. Other people argue that Tuples are closer to Arrays.
&nbsp;
### 3.) Challenge 3: Sorting Fruits
### 3.) Challenge: Sorting Fruits
```py
>>> fruits = ['apple', 'pear', 'orange', 'banana', 'apple']
......@@ -126,11 +140,13 @@ None,
What is the difference between List-function *sort()* and built-in
function *sorted()*, see
[link](https://www.python-engineer.com/posts/sort-vs-sorted) (2 Pts)?
[link](https://www.python-engineer.com/posts/sort-vs-sorted)?
(2 Pts)
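The following sketch contrasts the two calls on the fruit list used above. It illustrates the behaviour only; the variable names `ordered_copy` and `result` are chosen for this example.

```python
fruits = ['apple', 'pear', 'orange', 'banana', 'apple']

# sorted() is a built-in function: it returns a new list and leaves
# the original untouched. It accepts any iterable.
ordered_copy = sorted(fruits)
print(ordered_copy)   # ['apple', 'apple', 'banana', 'orange', 'pear']
print(fruits)         # ['apple', 'pear', 'orange', 'banana', 'apple']

# list.sort() is a method of list: it sorts in place and returns None.
result = fruits.sort()
print(result)         # None
print(fruits)         # ['apple', 'apple', 'banana', 'orange', 'pear']
```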
&nbsp;
### 4.) Challenge 4: Income Analysis
### 4.) Challenge: Income Analysis
The US Internal Revenue Service (IRS) annually
publishes income statistics by ZIP codes (postal codes)
([reports](https://www.irs.gov/statistics/soi-tax-stats-individual-income-tax-statistics-2020-zip-code-data-soi)).
......@@ -186,7 +202,7 @@ Answer questions:
&nbsp;
### 5.) Challenge 5: Code Income Analysis
### 5.) Challenge: Code Income Analysis
Write Python code to perform this income analysis for arbitrary
ZIP regions.
......@@ -244,15 +260,15 @@ Results:
mean_income in Madera County, CA is: 453,073 - median_income is: 60,714
mean_income in Mountain View, CA is: 1,740,371 - median_income is: 114,820
mean_income in Palo Alto, CA is: 2,077,038 - median_income is: 153,658
mean_income in Atherton, CA is: 2,623,881 - median_income is: 354,087
mean_income in Redding, IA is: 33,333 - median_income is: 31,249
mean_income in New York City, NY U West is: 1,544,990 - median_income is: 104,774
mean_income in Atherton, CA is: 2,623,882 - median_income is: 354,088
mean_income in Redding, IA is: 33,333 - median_income is: 31,250
mean_income in New York City, NY U West is: 1,544,991 - median_income is: 104,775
```
(4 Pts)
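One plausible reason for the gap between mean and median in the results is that a few very large incomes pull the mean upward while the median stays near the middle of the distribution. The sketch below shows this effect with Python's `statistics` module. The income values are hypothetical and are not taken from the IRS data.

```python
from statistics import mean, median

# Hypothetical incomes (not IRS data): most are moderate, one is very large.
incomes = [30_000, 35_000, 40_000, 45_000, 2_000_000]

mean_income = mean(incomes)       # pulled upward by the single large value
median_income = median(incomes)   # middle value of the sorted list

print(f"mean_income is: {mean_income:,.0f} - median_income is: {median_income:,.0f}")
```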
&nbsp;
### 6.) Challenge 6: Explore Python built-in functions
### 6.) Challenge: Explore Python built-in functions
Learn about Python's
[built-in functions](https://docs.python.org/3/library/functions.html).
Test the
......
......@@ -45,7 +45,7 @@ def print_analysis(_zip):
)
# attempt to load solution module (ignore)
# attempt to load solution module (if present - ignore)
try:
solution_module = 'income_tax_analysis_sol'
mod = __import__(solution_module, globals(), locals(), [], 0)
......
{
"python.testing.unittestArgs": [
"-v",
"-s",
".",
"-p",
"test_*.py"
],
"python.testing.pytestEnabled": false,
"python.testing.unittestEnabled": true
}
\ No newline at end of file
# Assignment C: Python Expressions & Unit Tests &nbsp; (16 Pts)
This assignment demonstrates Python's powerful (*"one-liner"*) expressions.
### Challenges
- [Challenge 1:](#1-challenge-create-new-project) Create New Project
- [Challenge 2:](#2-challenge-run-code) Run Code
- [Challenge 3:](#3-challenge-run-unit-tests) Run Unit Tests
- [Challenge 4:](#4-challenge-write-expressions) Write Expressions
- [Challenge 5:](#5-challenge-final-test-and-sign-off) Final Test and sign-off
Points: [1, 2, 3, 0, 10]
&nbsp;
### 1.) Challenge: Create New Project
Create a Python project, e.g. named `C_expressions`, and
[pull files](https://gitlab.bht-berlin.de/sgraupner/ds_cs4bd_2324/-/tree/main/C_expressions)
from GitLab (above).
Inspect the files and figure out their purpose. Write 1-2 sentences describing what each
file is and what its purpose is:
- [\_\_init\_\_.py](https://gitlab.bht-berlin.de/sgraupner/ds_cs4bd_2324/-/blob/main/C_expressions/__init__.py)
: `_____________________________________`
- What does the init-file contain?
- When and how often is this file executed?
- [expressions.py](https://gitlab.bht-berlin.de/sgraupner/ds_cs4bd_2324/-/blob/main/C_expressions/expressions.py)
: `__________________________________`
- [test_expressions.py](https://gitlab.bht-berlin.de/sgraupner/ds_cs4bd_2324/-/blob/main/C_expressions/test_expressions.py)
: `______________________________`
(1 Pt)
&nbsp;
### 2.) Challenge: Run Code
Run file `expressions.py` in your IDE:
```
numbers: [4, 12, 3, 8, 17, 12, 1, 8, 7]
#
a) number of numbers: 9
b) first three numbers: []
c) last three numbers: []
d) last three numbers reverse: []
e) odd numbers: []
f) number of odd numbers: 0
g) sum of odd numbers: 0
h) duplicate numbers removed: []
i) number of duplicate numbers: 0
j) ascending, de-dup (n^2) numbers: []
k) length: NEITHER
[Done] exited with code=0 in 0.126 seconds
```
(1 Pt)
Run file `expressions.py` in terminal:
```sh
cd <project> # cd into project directory
pwd # print working directory
/c/.../workspaces/ds_cs4bd_2324/C_expressions
python expressions.py # run program
-->
numbers: [4, 12, 3, 8, 17, 12, 1, 8, 7]
#
a) number of numbers: 9
b) first three numbers: [4, 12, 3]
c) last three numbers: [1, 8, 7]
d) last three numbers reverse: [7, 8, 1]
e) odd numbers: [3, 17, 1, 7]
f) number of odd numbers: 4
g) sum of odd numbers: 28
h) duplicate numbers removed: [1, 3, 4, 7, 8, 12, 17]
i) number of duplicate numbers: 2
j) ascending, de-dup (n^2) numbers: [1, 9, 16, 49, 64, 144, 289]
k) length: ODD_LIST
```
(1 Pt)
&nbsp;
### 3.) Challenge: Run Unit Tests
Unit Tests are used to *"test-a-unit"* of code in isolation. This unit can be
a function, a file, a class, a module.
In contrast to running code regularly, Unit Tests execute under the
supervision of a `test runner` that:
- looks for (discovers) tested units,
- executes them with test data, collects test results regardless of
whether a test succeeded or failed, and
- reports test results at the end.
Read *"A Beginner’s Guide to Unit Tests in Python"*,
[link](https://www.dataquest.io/blog/unit-tests-python/),
and answer questions:
- How are tests discovered? Which feature makes the test runner collect
something as a test?
- What is an
[assert](https://docs.python.org/3/library/unittest.html#assert-methods)
statement? What happens when a test (assert) passes or fails?
- Where is the test runner started in given files?
(1 Pt)
Run tests in a terminal. Currently, only one test runs and passes:
*TestCase_a_number_of_numbers* :
```sh
python test_expressions.py # run tests directly from file calling the
# test runner in __main__
```
Output:
```
test_a_number_of_numbers (C_expressions.test_expressions.TestCase_a_number_of_numbers.test_a_number_of_numbers) ... ok
----------------------------------------------------------------------
Ran 1 test in 0.001s
OK
<unittest.runner.TextTestResult run=1 errors=0 failures=0>
```
Result: 1 test ran and passed.
Alternatively, run tests with test discovery. Run the unit test module that
starts the test runner, which in turn discovers tests that are then executed:
```sh
python -m unittest # let test runner discover tests
```
Output is the same as above.
(1 Pt)
Configure your IDE so that it runs Unit Tests (you can use an IDE other than VS Code,
which is used here as an example).
VS Code discovers unit tests under the beaker icon (circled in red).
The figure shows one unit test that has been discovered passing. Unit tests are
structured as *"TestCase - Classes"*, which are classes that inherit from class:
[unittest.TestCase](https://docs.python.org/3/library/unittest.html#unittest.TestCase),
in the example indirectly through class `Test_case_a`.
VS Code shows discovered test classes in the left panel and their execution result
with a green check mark when passed or a red cross when failed.
![](../markup/img/C_unit_tests_1.png)
Uncomment tests: *"Test_case_b"* and *"Test_case_c"* in `test_expressions.py`
above and re-run tests.
Both tests should fail because expressions they test have not been implemented:
![](../markup/img/C_unit_tests_2.png)
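Commenting and uncommenting the `Test_case_…` assignments works because `test_expressions.py` uses the base class of each test case class as a switch: a class deriving from `unittest.TestCase` is collected, a class deriving from `abc.ABC` is not. The sketch below reproduces this mechanism in isolation. The names `Test_case_x`, `Test_case_y` and the two test classes are hypothetical.

```python
import abc
import io
import unittest


class Test_case(unittest.TestCase):
    """Base class; its subclasses are collected as unit tests."""


# Hypothetical aliases following the naming used in test_expressions.py:
Test_case_x = Test_case   # enabled: alias of a unittest.TestCase subclass
Test_case_y = abc.ABC     # disabled: abc.ABC is not a unittest.TestCase


class TestCase_x_enabled(Test_case_x):
    def test_x(self):
        self.assertEqual(len([4, 12, 3]), 3)


class TestCase_y_disabled(Test_case_y):
    def test_y(self):     # not collected by unittest while the base is abc.ABC
        pass


# Keep only classes that derive from unittest.TestCase, as the loader does.
candidates = [TestCase_x_enabled, TestCase_y_disabled]
suite = unittest.TestSuite(
    unittest.defaultTestLoader.loadTestsFromTestCase(c)
    for c in candidates if issubclass(c, unittest.TestCase)
)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(f"run={result.testsRun} failures={len(result.failures)}")
```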
Re-run unit tests with the two tests failing in the terminal:
```sh
python -m unittest # let test runner discover tests
```
Output shows one passing and two failed tests:
```
======================================================================
FAIL: test_b_first_three_numbers (test_expressions.TestCase_b_first_three_numbers.test_b_first_three_numbers)
----------------------------------------------------------------------
Traceback (most recent call last):
File "C:\Sven1\svgr\workspaces\ds_cs4bd_2324\C_expressions\test_expressions.py", line 103, in test_b_first_three_numbers
self.assertEqual(self.ut1.b, [4, 12, 3])
AssertionError: Lists differ: [] != [4, 12, 3]
Second list contains 3 additional elements.
First extra element 0:
4
- []
+ [4, 12, 3]
======================================================================
FAIL: test_c_last_three_numbers (test_expressions.TestCase_c_last_three_numbers.test_c_last_three_numbers)
----------------------------------------------------------------------
Traceback (most recent call last):
File "C:\Sven1\svgr\workspaces\ds_cs4bd_2324\C_expressions\test_expressions.py", line 117, in test_c_last_three_numbers
td.assertEqual(td.ut1.c, [1, 8, 7])
AssertionError: Lists differ: [] != [1, 8, 7]
Second list contains 3 additional elements.
First extra element 0:
1
- []
+ [1, 8, 7]
----------------------------------------------------------------------
Ran 3 tests in 0.002s
FAILED (failures=2)
```
Output says: `Ran 3 tests`, `FAILED (failures=2)`.
When tests fail, the test report tells which tests have failed and why:
- *test_b_first_three_numbers* failed in line: 103. The test expected
result: `[4, 12, 3]`, but an empty list `[]` was found in the tested
expression: `self.b` in file `expressions.py`.
- *test_c_last_three_numbers* failed in line: 117 where the test expected
result: `[1, 8, 7]`, but an empty list `[]` was found in: `self.c`
Tests refer to the `self.numbers` list: `[4, 12, 3, 8, 17, 12, 1, 8, 7]`.
(1 Pt)
&nbsp;
### 4.) Challenge: Write Expressions
In order to let tests pass, write expressions in
[expressions.py](https://gitlab.bht-berlin.de/sgraupner/ds_cs4bd_2324/-/blob/main/C_expressions/expressions.py)
for variables `self.b` .. `self.k` according to specification, e.g. write an
expression for `self.b` that extracts the first three numbers from `self.numbers`.
Use <b>one-line expressions</b> only.
Python's [built-in functions](https://docs.python.org/3/library/functions.html)
are allowed, but not your own functions.
Tests exercise expressions with various lists. Initialization with constants
(`self.b = [4, 12, 3]`) will hence not work.
Write expressions incrementally, one after the other - not all at once. Some
expressions require thinking and reading.
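To illustrate what a one-line expression looks like, here is a sketch for two hypothetical tasks that are not among a) to k). Both expressions are computed from the list rather than written as constants, so they also work for other lists, including the empty list.

```python
numbers = [4, 12, 3, 8, 17, 12, 1, 8, 7]

# Hypothetical task: numbers greater than 10, in original order,
# written as a list comprehension with a filter condition.
large_numbers = [n for n in numbers if n > 10]

# Hypothetical task: largest number, or None for an empty list,
# written as a conditional expression.
largest = max(numbers) if numbers else None

print(large_numbers)   # [12, 17, 12]
print(largest)         # 17
```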
Once you have written an expression, uncomment the corresponding test case in
[test_expressions.py](https://gitlab.bht-berlin.de/sgraupner/ds_cs4bd_2324/-/blob/main/C_expressions/test_expressions.py)
and re-run the test. See if it is passing or figure out why it is failing
from the test report.
![](../markup/img/C_unit_tests_3.png)
Test cases a), b) and c) are now passing.
Continue until all tests pass.
![](../markup/img/C_unit_tests_4.png)
&nbsp;
### 5.) Challenge: Final Test and sign-off
For sign-off, change into the `C_expressions` directory and copy the following commands into a terminal:
```sh
# Fetch test file from GitLab and run tests for sign-off.
# The sed command uncomments the test cases.
test_url=https://gitlab.bht-berlin.de/sgraupner/ds_cs4bd_2324/-/raw/main/C_expressions/test_expressions.py
curl $test_url | \
sed -e 's/^#.*Test_case_/Test_case_/' | \
python
```
Result:
```
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 7874 100 7874 0 0 55666 0 --:--:-- --:--:-- --:--:-- 56242
...........
----------------------------------------------------------------------
Ran 11 tests in 0.003s
OK
```
11 tests succeeded.
(10 Pts, 1 Pt for each test passing)
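The `sed` substitution in the sign-off pipeline enables all test cases by rewriting lines that start with `#` and contain `Test_case_`. The sketch below applies the same regular expression with Python's `re` module, purely for illustration. The sample lines are assumptions modelled on `test_expressions.py`.

```python
import re

# Sample lines in the style of test_expressions.py (assumed for illustration).
lines = [
    "Test_case_a = Test_case",
    "# Test_case_b = Test_case",
    "# uncomment tests one after another",
]

# Same substitution as: sed -e 's/^#.*Test_case_/Test_case_/'
uncommented = [re.sub(r"^#.*Test_case_", "Test_case_", line) for line in lines]

for line in uncommented:
    print(line)
```

Only the second line changes: the first does not start with `#`, and the third contains no `Test_case_`.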
import os
"""
Special file __init__.py marks a directory as a Python package.
A Python Package is a collection of Python modules with an
__init__.py File. The file is executed when the package is imported.
The file is also needed for VS Code test runner to discover tests.
Special file __init__.py marks a directory as a Python module.
The file is executed once when any .py file is imported.
//
Python unittest requires the presence of this file (even if empty).
"""
def package_dir(file):
"""
Return name of directory of this package.
"""
path = os.path.normpath(file).split(os.sep)
return path[len(path)-2] # e.g. "C_numbers"
def project_path(file):
"""
Return path to project directory.
"""
path = os.path.normpath(file).split(os.sep)
return os.path.dirname(file)[:-len(PACKAGE_DIR)-1]
def import_sol_module(file):
"""
Import and return module with name "file + '_sol'".
Raises ImportError exception, if _sol file does not exist.
"""
sol_module = os.path.splitext(os.path.basename(file))[0] + "_sol" # works with any OS path separator
return __import__(sol_module, globals(), locals(), [], 0)
# name of this package directory
PACKAGE_DIR = package_dir(__file__)
# path to project directory, in which this module resides
PROJECT_PATH = project_path(__file__)
# load setup module when executed in parent directory
try:
__import__('setup')
#
except ImportError:
pass
from __init__ import import_sol_module
class Expressions:
"""
Class for the assignment. Fill in one-line expressions (no own functions)
to initialize values self.b .. self.k with specified values.
Fill in one-line expressions (no own functions) to initialize attributes
self.b .. self.k with specified values.
Use Python built-in functions, list expressions and list comprehension,
but NOT own functions.
Complete tasks one after another. Once you are done with one task,
uncomment test case in test_numbers.py. Remove comments for
# TestCase_b = Test_Expressions
# TestCase_c = Test_Expressions
# TestCase_d = Test_Expressions
uncomment test cases in test_expressions.py. Remove comments for
# Test_case_b = Test_case
# Test_case_c = Test_case
# Test_case_d = Test_case
# ...
Run tests in IDE and in a terminal:
python test_expressions.py
python -m unittest test_expressions.py
python -m unittest
"""
default_numbers=[4, 12, 3, 8, 17, 12, 1, 8, 7]
......@@ -26,11 +24,10 @@ class Expressions:
"""
Constructor to initialize member variables.
"""
# initialize numbers list
self.numbers = _numbers
# a) initialize with number of numbers: 9
self.a = len(self.numbers) # <-- insert expression for a), solution is given here
self.a = len(self.numbers) # <-- given solution, insert one-line expressions below
# b) initialize with first three numbers: [4, 12, 3]
self.b = [] # <-- write expression here
......@@ -64,9 +61,9 @@ class Expressions:
# attempt to load solution module (ignore)
try:
mod = import_sol_module(__file__)
mod.set_solution(self) # replace empty values with solutions
# print(f'solution module found: {solution_module}.py')
_from, _import = 'expressions_sol', 'Stream'
mod = __import__(_from, fromlist=[_import])
mod.set_solution(self) # invoke set_solution() to replace values with solutions
#
except ImportError:
pass
......@@ -95,11 +92,12 @@ class Expressions:
if __name__ == '__main__':
'''
Driver that runs when this file is directly executed.
Driver code that runs when this file is directly executed.
'''
#
n1 = Expressions() # use default list
n1 = Expressions() # use default list
#
# 2nd object with different list
n2 = Expressions([1, 4, 6, 67, 6, 8, 23, 8, 34, 49, 67,
6, 8, 23, 37, 67, 6, 34, 19, 67, 6, 8])
#
......
Run unit tests with discovery (-m) or from __main__() with verbosity level 2
- python -m unittest test_numbers.py
- python test_numbers.py
Output with verbosity level < 2:
================================
...........
----------------------------------------------------------------------
Ran 11 tests in 0.002s
OK
Output with verbosity level >=2:
================================
test_a_number_of_numbers (C_expressions.test_expressions.TestCase_a_number_of_numbers.test_a_number_of_numbers) ... ok
test_b_first_three_numbers (C_expressions.test_expressions.TestCase_b_first_three_numbers.test_b_first_three_numbers) ... ok
test_c_last_three_numbers (C_expressions.test_expressions.TestCase_c_last_three_numbers.test_c_last_three_numbers) ... ok
test_d_last_threeClass_in_reverse (C_expressions.test_expressions.TestCase_d_last_threeClass_in_reverse.test_d_last_threeClass_in_reverse) ... ok
test_e_odd_numbers (C_expressions.test_expressions.TestCase_e_odd_numbers.test_e_odd_numbers) ... ok
test_f_number_of_odd_numbers (C_expressions.test_expressions.TestCase_f_number_of_odd_numbers.test_f_number_of_odd_numbers) ... ok
test_g_sum_of_odd_numbers (C_expressions.test_expressions.TestCase_g_sum_of_odd_numbers.test_g_sum_of_odd_numbers) ... ok
test_h_duplicateClass_removed (C_expressions.test_expressions.TestCase_h_duplicateClass_removed.test_h_duplicateClass_removed) ... ok
test_i_number_of_duplicate_numbers (C_expressions.test_expressions.TestCase_i_number_of_duplicate_numbers.test_i_number_of_duplicate_numbers) ... ok
test_j_ascending_squaredClass_no_duplicates (C_expressions.test_expressions.TestCase_j_ascending_squaredClass_no_duplicates.test_j_ascending_squaredClass_no_duplicates) ... ok
test_k_classifyClass_as_odd_even_empty (C_expressions.test_expressions.TestCase_k_classifyClass_as_odd_even_empty.test_k_classifyClass_as_odd_even_empty) ... ok
----------------------------------------------------------------------
Ran 11 tests in 0.005s
OK
<unittest.runner.TextTestResult run=11 errors=0 failures=0>
"""
Run unit tests with discovery (-m) or from __main__() with verbosity level 2
- python -m unittest test_numbers.py
- python test_numbers.py
- python -m unittest
- python test_expressions.py
Output with verbosity level < 2:
================================
......@@ -12,84 +12,64 @@ OK
<unittest.runner.TextTestResult run=11 errors=0 failures=0>
"""
import unittest
import abc # import Abstract Base Class (ABC) from module abc
from expressions import Expressions
from expressions import __file__ as numbers__file__
from __init__ import PACKAGE_DIR, PROJECT_PATH, import_sol_module
class TestCase_test_data:
"""
Class with test data (objects under test, instances of class Numbers: ut1...)
"""
# objects "under test" or "tested objects" are instances
# of class Numbers initialized with varying lists
ut1 = Expressions(Expressions.default_numbers) # [4, 12, 3, 8, 17, 12, 1, 8, 7]
ut2 = Expressions([1, 4, 6, 67, 6, 8, 23, 8, 34, 49, 67, 6, 8, 23, 37, 67, 6, 34, 19, 67, 6, 8])
ut3 = Expressions([6, 67, 6, 8, 17, 3, 6, 8])
ut4 = Expressions([8, 3, 9])
ut5 = Expressions([1, 1, 1])
ut6 = Expressions([0, 0])
ut7 = Expressions([0])
ut8 = Expressions([])
class Test_Expressions(unittest.TestCase):
"""
tested objects (objects "under test", "ut") as instances of the Expressions class
"""
ut1 = Expressions(Expressions.default_numbers) # [4, 12, 3, 8, 17, 12, 1, 8, 7]
ut2 = Expressions([1, 4, 6, 67, 6, 8, 23, 8, 34, 49, 67, 6, 8, 23, 37, 67, 6, 34, 19, 67, 6, 8])
ut3 = Expressions([6, 67, 6, 8, 17, 3, 6, 8])
ut4 = Expressions([8, 3, 9])
ut5 = Expressions([1, 1, 1])
ut6 = Expressions([0, 0])
ut7 = Expressions([0])
ut8 = Expressions([])
class Test_case(unittest.TestCase):
"""
Top-level class that inherits from class unittest.TestCase
and injects test data into derived test classes.
Sub-classes of unittest.TestCase are discovered as unit tests.
and injects test data into derived classes for test cases.
Sub-classes are discovered as unit tests.
"""
def setUp(self):
# TestCase_test_data.inject_test_data_into(self)
td_ = TestCase_test_data
self.ut1 = td_.ut1
self.ut2 = td_.ut2
self.ut3 = td_.ut3
self.ut4 = td_.ut4
self.ut5 = td_.ut5
self.ut6 = td_.ut6
self.ut7 = td_.ut7
self.ut8 = td_.ut8
class Disabled_test:
"""
Class does not inherit from unittest.TestCase and
is hence ignored by test discovery.
"""
pass
try:
mod = import_sol_module(numbers__file__)
verbosity_level = 1
Test_class = Test_Expressions
#
except ImportError:
verbosity_level = 2
Test_class = Disabled_test
# initialize test cases as tests (Test_Numbers) or disabled
TestCase_a = TestCase_b = TestCase_c = TestCase_d = Test_class
TestCase_e = TestCase_f = TestCase_g = TestCase_h = Test_class
TestCase_i = TestCase_j = TestCase_k = Test_class
# uncomment tests, one after the other as you progress from b) through k)
TestCase_a = Test_Expressions # test a) passes, solution is given in numbers.py
TestCase_b = Test_Expressions
TestCase_c = Test_Expressions
TestCase_d = Test_Expressions
TestCase_e = Test_Expressions
TestCase_f = Test_Expressions
TestCase_g = Test_Expressions
TestCase_h = Test_Expressions
TestCase_i = Test_Expressions
TestCase_j = Test_Expressions
TestCase_k = Test_Expressions
class TestCase_a_number_of_numbers(TestCase_a):
self.ut1 = ut1
self.ut2 = ut2
self.ut3 = ut3
self.ut4 = ut4
self.ut5 = ut5
self.ut6 = ut6
self.ut7 = ut7
self.ut8 = ut8
# disable tests by assigning Python's Abstract Base Class (ABC) to test
# case classes, which will not be discovered as unit tests
Test_case_a = Test_case_b = Test_case_c = Test_case_d = \
Test_case_e = Test_case_f = Test_case_g = Test_case_h = \
Test_case_i = Test_case_j = Test_case_k = abc.ABC
# assign Test_case class (above) as subclass of unittest.TestCase and with
# attributes of tested objects (self.ut1...ut8)
# uncomment tests one after another as you progress with expressions
Test_case_a = Test_case # test a) passes, solution is given in expressions.py
# Test_case_b = Test_case
# Test_case_c = Test_case
# Test_case_d = Test_case
# Test_case_e = Test_case
# Test_case_f = Test_case
# Test_case_g = Test_case
# Test_case_h = Test_case
# Test_case_i = Test_case
# Test_case_j = Test_case
# Test_case_k = Test_case
class TestCase_a_number_of_numbers(Test_case_a):
#
# tests a): number of numbers tests (lengths of numbers lists)
def test_a_number_of_numbers(self):
......@@ -103,7 +83,7 @@ class TestCase_a_number_of_numbers(TestCase_a):
self.assertEqual(self.ut8.a, 0)
class TestCase_b_first_three_numbers(TestCase_b):
class TestCase_b_first_three_numbers(Test_case_b):
#
# tests b): first three numbers
def test_b_first_three_numbers(self):
......@@ -117,7 +97,7 @@ class TestCase_b_first_three_numbers(TestCase_b):
self.assertEqual(self.ut8.b, [])
class TestCase_c_last_three_numbers(TestCase_c):
class TestCase_c_last_three_numbers(Test_case_c):
#
# tests c): last three numbers
def test_c_last_three_numbers(td):
......@@ -131,7 +111,7 @@ class TestCase_c_last_three_numbers(TestCase_c):
td.assertEqual(td.ut8.c, [])
class TestCase_d_last_threeClass_in_reverse(TestCase_d):
class TestCase_d_last_threeClass_in_reverse(Test_case_d):
#
# tests d): last three numbers in reverse
def test_d_last_threeClass_in_reverse(td):
......@@ -145,9 +125,9 @@ class TestCase_d_last_threeClass_in_reverse(TestCase_d):
td.assertEqual(td.ut8.d, [])
class TestCase_e_odd_numbers(TestCase_e):
class TestCase_e_odd_numbers(Test_case_e):
#
# tests e): odd numbers
# tests e): odd numbers, order must be preserved
def test_e_odd_numbers(td):
td.assertEqual(td.ut1.e, [3, 17, 1, 7])
td.assertEqual(td.ut2.e, [1, 67, 23, 49, 67, 23, 37, 67, 19, 67])
......@@ -159,7 +139,7 @@ class TestCase_e_odd_numbers(TestCase_e):
td.assertEqual(td.ut8.e, [])
class TestCase_f_number_of_odd_numbers(TestCase_f):
class TestCase_f_number_of_odd_numbers(Test_case_f):
#
# tests f): number of odd numbers
def test_f_number_of_odd_numbers(td):
......@@ -173,7 +153,7 @@ class TestCase_f_number_of_odd_numbers(TestCase_f):
td.assertEqual(td.ut8.f, 0)
class TestCase_g_sum_of_odd_numbers(TestCase_g):
class TestCase_g_sum_of_odd_numbers(Test_case_g):
#
# tests g): sum of odd numbers
def test_g_sum_of_odd_numbers(td):
......@@ -187,21 +167,21 @@ class TestCase_g_sum_of_odd_numbers(TestCase_g):
td.assertEqual(td.ut8.g, 0)
class TestCase_h_duplicateClass_removed(TestCase_h):
class TestCase_h_duplicateClass_removed(Test_case_h):
#
# tests h): duplicate numbers removed
# tests h): duplicate numbers removed - use set() to accept any order
def test_h_duplicateClass_removed(td):
td.assertEqual(td.ut1.h, [4, 12, 3, 8, 17, 1, 7])
td.assertEqual(td.ut2.h, [1, 4, 6, 67, 8, 23, 34, 49, 37, 19])
td.assertEqual(td.ut3.h, [6, 67, 8, 17, 3])
td.assertEqual(td.ut4.h, [8, 3, 9])
td.assertEqual(set(td.ut1.h), {4, 12, 3, 8, 17, 1, 7})
td.assertEqual(set(td.ut2.h), {1, 4, 6, 67, 8, 23, 34, 49, 37, 19})
td.assertEqual(set(td.ut3.h), {6, 67, 8, 17, 3})
td.assertEqual(set(td.ut4.h), {8, 3, 9})
td.assertEqual(td.ut5.h, [1])
td.assertEqual(td.ut6.h, [0])
td.assertEqual(td.ut7.h, [0])
td.assertEqual(td.ut8.h, [])
class TestCase_i_number_of_duplicate_numbers(TestCase_i):
class TestCase_i_number_of_duplicate_numbers(Test_case_i):
#
# tests i): number of duplicate numbers
def test_i_number_of_duplicate_numbers(td):
......@@ -215,21 +195,21 @@ class TestCase_i_number_of_duplicate_numbers(TestCase_i):
td.assertEqual(td.ut8.i, 0)
class TestCase_j_ascending_squaredClass_no_duplicates(TestCase_j):
class TestCase_j_ascending_squaredClass_no_duplicates(Test_case_j):
#
# tests j): ascending list of squared numbers with no duplicates
def test_j_ascending_squaredClass_no_duplicates(td):
td.assertEqual(td.ut1.j, [1, 9, 16, 49, 64, 144, 289])
td.assertEqual(td.ut2.j, [1, 16, 36, 64, 361, 529, 1156, 1369, 2401, 4489])
td.assertEqual(td.ut3.j, [9, 36, 64, 289, 4489])
td.assertEqual(td.ut4.j, [9, 64, 81])
td.assertEqual(set(td.ut1.j), {1, 9, 16, 49, 64, 144, 289})
td.assertEqual(set(td.ut2.j), {1, 16, 36, 64, 361, 529, 1156, 1369, 2401, 4489})
td.assertEqual(set(td.ut3.j), {9, 36, 64, 289, 4489})
td.assertEqual(set(td.ut4.j), {9, 64, 81})
td.assertEqual(td.ut5.j, [1])
td.assertEqual(td.ut6.j, [0])
td.assertEqual(td.ut7.j, [0])
td.assertEqual(td.ut8.j, [])
class TestCase_k_classifyClass_as_odd_even_empty(TestCase_k):
class TestCase_k_classifyClass_as_odd_even_empty(Test_case_k):
#
# tests k): classify as "ODD_LIST", "EVEN_LIST" or "EMPTY_LIST" depending on numbers length
def test_k_classifyClass_as_odd_even_empty(td):
......@@ -244,12 +224,4 @@ class TestCase_k_classifyClass_as_odd_even_empty(TestCase_k):
if __name__ == '__main__':
#
# discover tests in this package
test_classes = unittest.defaultTestLoader \
.discover(PACKAGE_DIR, pattern='test_*.py', top_level_dir=PROJECT_PATH)
#
suite = unittest.TestSuite(test_classes)
runner = unittest.runner.TextTestRunner(verbosity=verbosity_level)
result = runner.run(suite)
print(result)
unittest.main()
# Assignment D: Recursive Problem Solving &nbsp; (15 Pts + 4 Extra Pts)
Recursion is not just a *"function calling itself"*, it is a way of thinking
about a class of problems that can be split into simple *"base cases"* and
remaining *"sub-problems"* that are *"self-similar"*.
A *"sub-problem"* is self-similar when it looks exactly the same as the
original problem, just smaller (e.g. reduced by one element). At some point,
the simple *"base case"* is reached, which yields a primitive solution.
A recursive *solution function* exploiting self-similarity has three phases:
1. *Reduction:* - slicing the problem (e.g. a list of numbers) into one
element (e.g. the first number) and a remaining *sub-problem*
(e.g. the list of remaining numbers).
1. *Recursion:* - invoke the same function for the *sub-problem*
until the *sub-problem* has been reduced to the *base case*.
Return the solution for the *base case*.
1. *Construction:* - results of recursive invocations are considered
as solutions of *sub-problems* and are combined with the element
that was isolated at the particular level of recursion.
While this approach is elegant from a problem-solving and programming
point of view, it comes at a cost: each recursive invocation occupies a frame on the
[Callstack](https://en.wikipedia.org/wiki/Call_stack),
which is a data structure of the abstract data type
[Stack](https://en.wikipedia.org/wiki/Stack_(abstract_data_type)).
Deep recursions therefore consume memory in proportion to their depth.
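The cost of the call stack can be made visible with a small experiment. The sketch below is only an illustration: the function name `depth` is hypothetical and not part of `recursion.py`, and the exact limit depends on the Python interpreter and its settings.

```python
import sys

def depth(n: int) -> int:
    """Hypothetical helper: recurse n times, one stack frame per call."""
    if n == 0:                  # base case, no further invocation
        return 0
    return 1 + depth(n - 1)     # each pending call waits on the call stack

print(depth(100))               # 100 nested calls are unproblematic

limit = sys.getrecursionlimit() # maximum nesting depth allowed by the interpreter
print(f'recursion limit: {limit}')

try:
    depth(limit + 100)          # deliberately deeper than the limit
except RecursionError as e:
    print(f'RecursionError: {e}')
```

The interpreter raises `RecursionError` before the stack overflows, which is why very deep recursions need a different strategy (or an iterative formulation).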
### Challenges
- [Challenge 1:](#1-challenge-simple-recursion-sum-numbers) Simple recursion: *sum* numbers
- [Challenge 2:](#2-challenge-fibonacci-numbers) Fibonacci numbers
- [Challenge 3:](#3-challenge-permutation) Permutation
- [Challenge 4:](#4-challenge-powerset) Powerset
- [Challenge 5:](#5-challenge-find-matching-pairs) Find Matching Pairs
- [Challenge 6:](#6-challenge-combinatorial-problem-of-finding-numbers) Combinatorial Problem of Finding Numbers
- [Challenge 7:](#7-challenge-hard-problem-of-finding-numbers) Hard Problem of Finding Numbers
Points: [2, 1, 2, 2, 2, 3, 2, +4 extra pts]
File [recursion.py](recursion.py) has function headers defined for each challenge.
Use those functions and complete code.
&nbsp;
### 1.) Challenge: Simple recursion: *sum()* numbers
Computing the *sum* of numbers is most often performed *iteratively* as a loop
over given numbers and adding them in a result variable.
Solving the problem *recursively* illustrates the concept of self-similarity
and recursive problem solving.
Use the following approach:
1. *Reduction:* - split the given list of numbers into a first element (first number)
and a list of remaining numbers (*sub-problem*). Remember the first element.
1. *Recursion:* - invoke *sum()* for the list of remaining numbers until the base case
has been reached: *sum( [ ] ) = 0* or *sum( [n] )=n*.
1. *Construction:* - add the remembered element to the value returned from the
recursive invocation of *sum()*.
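The three steps can be sketched as a stand-alone function. This is a minimal illustration, assuming a plain list of numbers; the name `rec_sum` is hypothetical and differs from the method `sum(self, _numbers)` expected in `recursion.py`.

```python
def rec_sum(numbers: list) -> int:
    """Sum a list recursively (hypothetical stand-alone sketch)."""
    if len(numbers) == 0:       # base case: sum([]) = 0
        return 0
    # Reduction: isolate the first element, keep the rest as sub-problem
    first, rest = numbers[0], numbers[1:]
    # Recursion + Construction: solve the sub-problem, then add the first element
    return first + rec_sum(rest)

print(rec_sum([9, 4, 8, 10, 2, 4, 8, 3, 14, 4, 8]))
```

Note that slicing with `numbers[1:]` copies the remaining list at every level, which is relevant for the memory question in this challenge.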
Complete: `sum(_numbers)` using this approach:
```py
def sum(self, _numbers) -> int:
# your code
return #...
```
Remove comment from `run_choices` and run the program:
```py
run_choices = [
1, # Challenge 1, Simple recursion: sum numbers
...
]
```
Output:
```
n1.numbers: [9, 4, 8, 10, 2, 4, 8, 3, 14, 4, 8]
sum(n1.numbers): 74
```
Answer questions:
1. How many times is the *"first element"* stored?
How much memory is used for applying the function to a list of *n* numbers?
1. What is the run-time estimate for *sum()* given a list of *n* numbers?
1. How many *stack-frames* are used for a list of *n* numbers?
(2 Pts)
&nbsp;
### 2.) Challenge: Fibonacci numbers
[Fibonacci numbers](https://en.wikipedia.org/wiki/Fibonacci_number) were first
described in Indian mathematics as early as 200 BC in works by *Pingala* on
enumerating possible patterns of Sanskrit poetry formed from syllables of two lengths.
Italian mathematician *Leonardo of Pisa*, later known as
*[Fibonacci](https://en.wikipedia.org/wiki/Fibonacci)*,
introduced the sequence to Western European mathematics in his 1202 book
*[Liber Abaci](https://en.wikipedia.org/wiki/Liber_Abaci)*.
Numbers of the *Fibonacci sequence* are defined as: *fib(0): 0*, *fib(1): 1*, *...*
and each following number is the sum of the two preceding numbers.
Fibonacci numbers are widely found in *nature*, *science*, *social behaviors* of
populations and *arts*, e.g. they form the basis of the
[Golden Ratio](https://www.adobe.com/creativecloud/design/discover/golden-ratio.html),
which is widely used in *painting* and *photography*, see also this
[1:32min](https://www.youtube.com/watch?v=v6PTrc0z4w4) video.
<img src="../markup/img/fibonacci.jpg" alt="drawing" width="640"/>
<!-- ![image](../markup/img/fibonacci.jpg) -->
&nbsp;
Complete functions `fib(n)` and `fib_gen(n)`.
```py
def fib(self, _n) -> int:
# return value of n-th Fibonacci number
return #...
def fib_gen(self, _n):
# return a generator object that yields two lists, one with n and the
# other with corresponding fib(n)
yield #...
```
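For readers new to generators, the following sketch shows the mechanism on an unrelated example. The function `countdown` is hypothetical and only illustrates `yield`; it is not the expected `fib_gen()` solution.

```python
def countdown(n: int):
    """Hypothetical generator: produces n, n-1, ..., 1 one value at a time."""
    while n > 0:
        yield n     # hand one value to the caller and pause here
        n -= 1      # execution resumes at this line on the next request

gen = countdown(3)  # calling the function only creates the generator object
print(next(gen))    # runs the body up to the first yield
print(list(gen))    # consumes the remaining values
```

Values are computed on demand, so a generator does not need to hold the whole sequence in memory.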
Remove comment from `run_choices` and run the program:
```py
run_choices = [
1, # Challenge 1, Simple recursion: sum numbers
2, # Challenge 2, Fibonacci numbers
...
]
```
Output:
```
n: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
fib(n): [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765]
```
Answer questions:
1. Explain the concept of a generator in Python.
1. Why can't `fib(60)` or `fib(90)` be computed recursively?
1. What is the more limiting constraint: memory use or needed run time?
```py
n = 30
print(f'fib({n}): {n1.fib(n)}')
n = 60
print(f'fib({n}): {n1.fib(n)}') # ??
n = 90
print(f'fib({n}): {n1.fib(n)}') # ??
```
Understand the problem and use a technique called
[memoization](https://stackoverflow.com/questions/7875380/recursive-fibonacci-memoization)
to make the solution work for *n=60* and *n=90* - still recursively (!).
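One possible shape of a memoized, still recursive, solution is sketched below. It assumes a dictionary passed along as cache; the name `fib_memo` is hypothetical, and other cache designs (for instance `functools.lru_cache`) would work as well.

```python
def fib_memo(n: int, memo: dict = None) -> int:
    """Recursive Fibonacci with a cache of already computed values (sketch)."""
    if memo is None:
        memo = {}               # fresh cache for each top-level call
    if n < 2:                   # base cases: fib(0) = 0, fib(1) = 1
        return n
    if n not in memo:           # compute each value only once
        memo[n] = fib_memo(n - 1, memo) + fib_memo(n - 2, memo)
    return memo[n]

for n in (30, 60, 90):
    print(f'fib({n}): {fib_memo(n)}')
```

With the cache, each `fib(k)` is computed once, so the number of invocations grows linearly with *n* instead of exponentially.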
Remove comments `#21` and `#22` from `run_choices` and run the program:
Output:
```
fib(30): 832040
fib(60): 1548008755920
fib(90): 2880067194370816120
```
(2 Pts)
&nbsp;
### 3.) Challenge: Permutation
[Permutation](https://en.wikipedia.org/wiki/Permutation) is a list of all
arrangements of elements.
For example:
```py
perm([1, 2, 3]) -> [[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]
perm([]) -> [[]]
perm([1]) -> [[1]]
perm([1, 2]) -> [[1, 2], [2, 1]]
perm([1, 2, 3]) -> [[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]
perm([1, 2, 3, 4]) -> [[1, 2, 3, 4], [1, 2, 4, 3], ... [4, 3, 1, 2], [4, 3, 2, 1]]
```
Find the pattern by which numbers are arranged for `perm([1, 2])` and `perm([1, 2, 3])`
and adapt it for `perm([1, 2, 3, 4])` to understand the algorithm.
Writing non-recursive code for that algorithm can be difficult, but it fits
well with the recursive sub-problem approach, which is elegant with the
four steps:
1. Return solutions for trivial cases: `[]`, `[1]`, `[1, 2]`.
1. Split the problem by removing the first number `n1` from the list leaving `r` as
remaining list (sub-problem).
1. Invoke `perm(r)` recursively on the remaining list.
1. Combine the result returned from `perm(r)` by adding `n1` to each element.
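The steps can be sketched as a stand-alone function. This is one possible reading, not the reference solution: it assumes that every element takes the first position in turn (the steps above name only the first number), and it is written as a plain function rather than the method in `recursion.py`.

```python
def perm(numbers: list) -> list:
    """Return all arrangements of numbers (stand-alone sketch)."""
    if len(numbers) <= 1:               # trivial cases: [] and [n]
        return [numbers[:]]
    res = []
    for i, n1 in enumerate(numbers):    # let each element be the first one
        r = numbers[:i] + numbers[i + 1:]   # remaining list (sub-problem)
        for p in perm(r):               # solve the sub-problem recursively
            res.append([n1] + p)        # combine: put n1 in front
    return res

print(perm([1, 2, 3]))
```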
```py
def perm(self, _numbers) -> list:
res=[] # collect result
# code...
# 1. Return solutions for trivial cases: `[]`, `[1]`, `[1, 2]`.
# 2. Split the problem by removing the first number `n1` from the list
# leaving `r` as remaining list (sub-problem).
# 3. Invoke `perm(r)` recursively on the remaining list.
# 4. Combine the result by adding `n1` to each returned element from `perm(r)`.
#
return res
lst = [1, 2, 3]
perm = n1.perm(lst)
print(f'perm({lst}) -> {perm}')
lst = [1, 2, 3, 4]
perm = n1.perm(lst)
print(f'perm({lst}) -> {perm}')
```
Remove comment `#3` from `run_choices` and run the program:
Output:
```
perm([1, 2, 3]) -> [[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]
perm([1, 2, 3, 4]) -> [[1, 2, 3, 4], [1, 2, 4, 3], ... [4, 3, 1, 2], [4, 3, 2, 1]]
```
Answer questions:
- With a rising length of the input list, how does the number of permutations grow?
(2 Pts)
&nbsp;
### 4.) Challenge: Powerset
[Powerset](https://en.wikipedia.org/wiki/Powerset) is a list of all
subsets of elements including the empty set.
For example:
```py
pset([1, 2, 3]) -> [[], [1], [2], [1, 2], [3], [1, 3], [2, 3], [1, 2, 3]]
```
Understand the pattern and complete function `pset()`.
```py
def pset(self, _numbers) -> list:
res=[] # collect result
# code...
# 1. Return solutions for trivial cases: `[]`, `[1]`, `[1, 2]`.
# 2. Split the problem by removing the first number `n1` from the list
# leaving `r` as remaining list (sub-problem).
# 3. Invoke `pset(r)` recursively on the remaining list.
# 4. Combine the result with the first element.
#
return res
lst = [1, 2, 3]
pset = n1.pset(lst)
print(f'pset({lst}) -> {pset}')
```
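A possible recursive formulation is sketched below as a stand-alone function. As an assumption made here to reproduce the order of the example output, the sketch splits off the *last* element instead of the first; splitting off the first element yields the same subsets in a different order.

```python
def pset(numbers: list) -> list:
    """Return all subsets of numbers, including the empty set (sketch)."""
    if not numbers:                     # base case: only the empty set
        return [[]]
    last, r = numbers[-1], numbers[:-1] # isolate one element, rest is sub-problem
    subsets = pset(r)                   # all subsets without the isolated element
    # every subset exists twice: once without and once with the isolated element
    return subsets + [s + [last] for s in subsets]

print(pset([1, 2, 3]))
```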
Remove comment `#4` from `run_choices` and run the program:
Output:
```py
pset([1, 2, 3]) -> [[], [1], [2], [1, 2], [3], [1, 3], [2, 3], [1, 2, 3]]
```
Answer questions:
- With a rising length of the input list, how does the size of the Powerset grow?
(2 Pts)
&nbsp;
### 5.) Challenge: Find Matching Pairs
Write three functions that find elements in a list.
The first function, `find`, returns the elements for which a boolean `match_func`
yields true.
The second function, `find_adjacent`, returns all indexes at which a given pair
of numbers occurs adjacently.
The third function, `find_pairs`, returns all pairs of numbers (not necessarily
adjacent) whose sum equals `n`. For example, `n=12` can be combined from
the input list with pairs: `[3, 9], [4, 8], [2, 10]`.
```py
def find(self, _numbers, match_func) -> list:
res = [] # code...
return res
def find_adjacent(self, pair, _numbers) -> list:
res = [] # code...
return res
def find_pairs(self, n, _numbers) -> list:
res = [] # code...
return res
lst = [9, 4, 8, 10, 2, 4, 8, 3, 14, 4, 8] # input list
#
div3 = n1.find(lst, match_func=lambda n : n % 3 == 0)
print(f'find numbers divisible by 3: {div3}')
#
p = [4, 8] # find all indexes of adjacent numbers [4, 8]
adj = n1.find_adjacent(p, lst)
print(f'find_adjacent({p}, list): {adj}')
#
n = 12 # find all pairs from the input list that add to n
pairs = n1.find_pairs(n, lst)
print(f'find_pairs({n}, list) -> {pairs}')
```
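To make the expected behavior concrete, here is an iterative sketch of the three functions as plain functions. It is an assumption about the intended semantics derived from the sample output (pairs are reported in ascending order, duplicates once); the recursive variants asked for in `recursion.py` are left open.

```python
def find(numbers: list, match_func) -> list:
    # keep the elements for which match_func yields True
    return [n for n in numbers if match_func(n)]

def find_adjacent(pair: list, numbers: list) -> list:
    # indexes i where numbers[i], numbers[i+1] equal the given pair
    return [i for i in range(len(numbers) - 1) if numbers[i:i + 2] == pair]

def find_pairs(n: int, numbers: list) -> list:
    res = []
    for i, a in enumerate(numbers):
        for b in numbers[i + 1:]:       # only look at later elements
            pair = sorted([a, b])       # order within a pair is not relevant
            if a + b == n and pair not in res:
                res.append(pair)        # collect each pair once
    return res

lst = [9, 4, 8, 10, 2, 4, 8, 3, 14, 4, 8]
print(find(lst, lambda n: n % 3 == 0))
print(find_adjacent([4, 8], lst))
print(find_pairs(12, lst))
```

The nested loops in `find_pairs` already hint at the answer to the step-count question for that function.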
Remove comments `#5`, `#51` and `#52` from `run_choices` and run the program:
Output:
```
find numbers divisible by 3: [9, 3]
find_adjacent([4, 8], list): [1, 5, 9]
find_pairs(12, list) -> [[3, 9], [4, 8], [2, 10]]
```
Answer questions:
- With a rising length of the input list, how many steps are needed to
complete each function in the best and worst case and on average?
| function | answers |
| ---------------- | ----------- |
| `find` | best case: ______, worst case: ______, average: ______ steps. |
| `find_adjacent` | best case: ______, worst case: ______, average: ______ steps. |
| `find_pairs` | best case: ______, worst case: ______, average: ______ steps. |
(3 Pts)
&nbsp;
### 6.) Challenge: Combinatorial Problem of Finding Numbers
`find_all_sums` is a function that returns all combinations of numbers from the
input list that add to `n`. For example, `n=14` can be combined from an
input list: `[8, 10, 2, 14, 4]` by combinations: `[4, 8, 2], [4, 10], [14]`.
A first approach to the problem is to understand the nature of possible
combinations from the input list. If all those combinations could be
generated, each could be tested whether its elements add to `n` and, if so,
be collected for the final result.
The order of numbers in solutions is not relevant (summation is commutative).
Duplicate solutions with the same numbers, but in different order, need to be removed.
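The generate-and-test idea can be sketched with the standard library. This is a brute-force illustration, not the expected recursive solution: the name `find_all_sums_brute` is hypothetical, combinations come from `itertools.combinations`, and the order of solutions differs from the sample output.

```python
from itertools import combinations

def find_all_sums_brute(n: int, numbers: list) -> list:
    """Test every combination of every size (exponential effort)."""
    res = []
    for k in range(1, len(numbers) + 1):        # combination sizes 1..len
        for combo in combinations(numbers, k):  # all selections of k elements
            if sum(combo) == n:
                candidate = sorted(combo)       # normalize order
                if candidate not in res:        # drop duplicate solutions
                    res.append(candidate)
    return res

print(find_all_sums_brute(14, [8, 10, 2, 14, 4]))
```

Since each element is either part of a combination or not, the number of candidates doubles with every additional input number.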
```py
def find_all_sums(self, n, _numbers) -> list:
res = [] # code...
return res
lst = [8, 10, 2, 14, 4] # input list
n = 14
all = n1.find_all_sums(n, lst)
print(f'find_all_sums({n}, lst) -> {all}')
```
Output:
```py
find_all_sums(14, lst) -> [[4, 8, 2], [4, 10], [14]]
```
Test your solution with a larger input set:
```py
lst = [ # input list
260, 720, 225, 179, 101, 767, 167, 200, 157, 289,
318, 303, 153, 290, 201, 594, 457, 607, 592, 246,
]
n = 469
all = n1.find_all_sums(n, lst)
print(f'find_all_sums({n}, lst) -> {all}')
```
Remove comments `#6`, and `#61` from `run_choices` and run the program:
Output:
```
find_all_sums(469, lst) -> [[179, 290], [101, 167, 201]]
```
Answer questions:
- With a rising length of the input list, how does the number of possible
solutions rise that must be tested?
(2 Pts)
&nbsp;
### 7.) Challenge: Hard Problem of Finding Numbers
Larger data sets can no longer be solved *"brute force"* by exploring all possible
2^n combinations.
Find a solution using a recursive approach exploring a decision tree or
with tabulation.
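One way to explore a decision tree is sketched below. It is a sketch under stated assumptions: all input numbers are positive integers, the input is sorted first so that branches exceeding the remaining sum can be cut off early, and the function is written stand-alone rather than as the method in `recursion.py`.

```python
def find_all_sums(n: int, numbers: list) -> list:
    """Find all combinations adding to n by pruned recursive search (sketch)."""
    nums = sorted(numbers)      # sorting enables early termination of branches
    res = []

    def explore(start: int, remaining: int, chosen: list):
        if remaining == 0:                  # chosen numbers add up to n
            res.append(chosen[:])
            return
        for i in range(start, len(nums)):
            if nums[i] > remaining:         # prune: all later numbers are larger
                break
            if i > start and nums[i] == nums[i - 1]:
                continue                    # skip equal values, avoids duplicates
            chosen.append(nums[i])          # decision: take nums[i]
            explore(i + 1, remaining - nums[i], chosen)
            chosen.pop()                    # undo decision, try next number

    explore(0, n, [])
    return res

print(find_all_sums(14, [8, 10, 2, 14, 4]))
```

Because the target is small compared to most numbers in the large list, the pruning keeps the explored part of the tree far below 2^n branches.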
```py
lst = [ # input list
260, 720, 225, 179, 101, 767, 167, 200, 157, 289,
318, 303, 153, 290, 201, 594, 457, 607, 592, 246,
132, 135, 584, 432, 591, 204, 417, 405, 362, 658,
136, 751, 583, 536, 293, 493, 431, 780, 563, 703,
400, 618, 397, 320, 513, 708, 319, 317, 685, 347,
758, 439, 145, 378, 158, 384, 551, 110, 408, 648,
847, 498, 50, 19, # 64 numbers
]
n = 469
all = n1.find_all_sums(n, lst)
for i, s in enumerate(all):
print(f' - {i+1:2}: sum({sum(s)}) -> {s}')
```
Sort output by length of solution (use length as primary and numeric value
of first element as secondary criterion).
Remove comment `#7` if you tackled the challenge and run the program:
Output:
```
1: sum(469) -> [290, 179]
2: sum(469) -> [19, 157, 293]
3: sum(469) -> [19, 246, 204]
4: sum(469) -> [19, 318, 132]
5: sum(469) -> [19, 400, 50]
6: sum(469) -> [50, 101, 318]
7: sum(469) -> [110, 201, 158]
8: sum(469) -> [136, 201, 132]
9: sum(469) -> [145, 167, 157]
10: sum(469) -> [158, 179, 132]
11: sum(469) -> [201, 101, 167]
12: sum(469) -> [19, 101, 204, 145]
13: sum(469) -> [19, 157, 135, 158]
14: sum(469) -> [19, 179, 135, 136]
15: sum(469) -> [19, 204, 136, 110]
16: sum(469) -> [19, 290, 110, 50]
17: sum(469) -> [19, 101, 167, 132, 50]
18: sum(469) -> [19, 132, 158, 110, 50]
```
( +4 Extra Pts)
from functools import cmp_to_key
"""
Assignment_D: recursion
"""
class Recursion:
numbers = [9, 4, 8, 10, 2, 4, 8, 3, 14, 4, 8] #[4, 12, 3, 8, 17, 12, 1, 3, 8, 7]
def sum(self, _numbers) -> int:
"""
Return sum of numbers using recursion.
Follow steps:
1. Return 0 for an empty list of numbers.
2. Split the problem by removing the first number `n1` from the list leaving `r` as remaining list (sub-problem).
3. Invoke `sum(r)` recursively on the remaining list.
4. Combine the result for the sub-problem with the first number `n1`: `return n1 + sum(r)`.
"""
# your code
return 0
def fib(self, _n, memo=None) -> int:
"""
Return value of n-th Fibonacci number.
- input: n=8
- output: 21
"""
# your code
return 0
def fib_gen(self, _n):
"""
Return a generator object that yields two lists, one with n and the
other with corresponding fib(n).
- input: n=16
- output: generator object that produces:
([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16],
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987])
"""
# your code
yield ([], [])
def perm(self, _numbers) -> list:
"""
Return permutation (all possible arrangements) for a given input list.
- input: [1, 2, 3]
- output: [[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]
"""
# your code
return []
def pset(self, _numbers) -> list:
"""
Return powerset (set of all subsets) for a given input list.
- input: [1, 2, 3]
- output: powerset, [[], [1], [2], [1, 2], [3], [1, 3], [2, 3], [1, 2, 3]]
"""
# your code
return []
def find(self, _numbers, match_func) -> list:
"""
Return list of elements n for which match_func(n) evaluates True.
"""
# your code
return []
def find_adjacent(self, pair, _numbers, _i=0) -> list:
"""
Return list of indexes of adjacent numbers in _numbers.
"""
# your code
return []
def find_pairs(self, n, _numbers) -> list:
"""
Return list of pairs from _numbers that add to n,
any pair, any order, no duplicates.
"""
# your code
return []
def find_all_sums(self, n, _numbers) -> list:
"""
Return all combinations of numbers in _numbers that add to n,
(any combination, any order, no duplicates).
"""
# your code
return []
def __init__(self, _numbers=numbers):
"""
Constructor to initialize member variables.
"""
self.numbers = _numbers
run_choices = [
1, # Challenge 1, Simple recursion: sum numbers
2, # Challenge 2, Fibonacci numbers
# 21, # Challenge 2.1, fib_gen()
# 22, # Challenge 2.2, memoization, fib(60), fib(90)
# 3, # Challenge 3, Permutation
# 4, # Challenge 4, Powerset
# 5, # Challenge 5, Finding matches, find()
# 51, # Challenge 5.1, find_adjacent() pairs
# 52, # Challenge 5.2, find_pairs() that add to n
# 6, # Challenge 6, Find all combinations that add to n
# 61, # Challenge 6.1, Find all in medium set
# 7 # Challenge 7, Hard problem finding numbers (extra points)
]
# Ignore this code that loads solution from file, if exists.
# The solution is not distributed.
try:
_from, _import = 'recursion_sol', 'Recursion'
Recursion = getattr(__import__(_from, fromlist=[_import]), _import)
#
except ImportError:
pass
if __name__ == '__main__':
"""
Main driver that runs when this file is executed by Python interpreter.
"""
run_choices = Recursion.run_choices
numbers = [9, 4, 8, 10, 2, 4, 8, 3, 14, 4, 8]
n1 = Recursion(numbers)
print(f'n1.numbers: {n1.numbers}')
# Challenge 1, Simple recursion: sum numbers
if 1 in run_choices:
s = n1.sum(n1.numbers)
print(f'sum(n1.numbers): {s}')
# Challenge 2, Fibonacci numbers
if 2 in run_choices:
n = 30
print(f'\nfib({n}): {n1.fib(n)}')
# Challenge 2.1, fib_gen()
if 21 in run_choices:
gen = n1.fib_gen(20) # yield generator object
n, fib = next(gen) # trigger generator
print(f'n: {n}')
print(f'fib(n): {fib}')
# Challenge 2.2, memoization, fib(60), fib(90)
if 22 in run_choices:
n = 60
print(f'fib({n}): {n1.fib(n)}') # ??
n = 90
print(f'fib({n}): {n1.fib(n)}') # ??
# Challenge 3, Permutation
if 3 in run_choices:
lst = [1, 2, 3]
perm = n1.perm(lst)
print(f'\nperm({lst}) -> {perm}')
# Challenge 4, Powerset
if 4 in run_choices:
lst = [1, 2, 3]
pset = n1.pset(lst)
print(f'\npset({lst}) -> {pset}')
lst = n1.numbers
#
# Challenge 5, Finding matches, find()
if 5 in run_choices:
div3 = n1.find(lst, match_func=lambda n : n % 3 == 0)
print(f'\nfind numbers divisible by 3: {div3}')
# Challenge 5.1, find_adjacent() pairs
if 51 in run_choices:
pair = [4, 8]
adj = n1.find_adjacent(pair, lst)
print(f'find_adjacent({pair}, list): {adj}')
# Challenge 5.2, find_pairs() that add to n
if 52 in run_choices:
n = 12
pairs = n1.find_pairs(n, lst)
print(f'find_pairs({n}, list) -> {pairs}')
lst = [8, 10, 2, 14, 4] # input list
#
# Challenge 6, Find all combinations that add to n
if 6 in run_choices:
print(f'\nlist: {lst}\n\\\\')
n = 14
all = n1.find_all_sums(n, lst)
print(f'find_all_sums({n}, lst) -> {all}')
#
n = 20
all = n1.find_all_sums(n, lst)
print(f' - find_all_sums({n}, lst) -> {all}')
#
n = 32
all = n1.find_all_sums(n, lst)
print(f' - find_all_sums({n}, lst) -> {all}')
lst = [ # input list
260, 720, 225, 179, 101, 767, 167, 200, 157, 289,
318, 303, 153, 290, 201, 594, 457, 607, 592, 246,
]
#
# Challenge 6.1, Find all in medium set
if 61 in run_choices:
print(f'\nlist({len(lst)}): {lst}\n\\\\')
n = 101 + 201 + 167 # 469 -> [[179, 290], [101, 167, 201]]
all = n1.find_all_sums(n, lst)
for i, s in enumerate(all):
print(f' {i+1:2}: find_all_sums({sum(s)}) -> {s}')
lst = [ # input list
260, 720, 225, 179, 101, 767, 167, 200, 157, 289,
318, 303, 153, 290, 201, 594, 457, 607, 592, 246,
132, 135, 584, 432, 591, 204, 417, 405, 362, 658,
136, 751, 583, 536, 293, 493, 431, 780, 563, 703,
400, 618, 397, 320, 513, 708, 319, 317, 685, 347,
758, 439, 145, 378, 158, 384, 551, 110, 408, 648,
847, 498, 50, 19, # 64 numbers
]
# Challenge 7, Hard problem finding numbers (extra points)
if 7 in run_choices:
print(f'\nlist({len(lst)}) with {len(lst)} numbers.\n\\\\')
n = 101 + 201 + 167 # 469
all = n1.find_all_sums(n, lst)
#
sort_cpm = lambda x, y: -1 if len(x) < len(y) else 1 if len(x) > len(y) else \
-1 if x <= y else 1 if x > y else 0
all.sort(key=cmp_to_key(sort_cpm)) # sort by len(solution)
#
for i, s in enumerate(all):
print(f' {i+1:2}: find_all_sums({sum(s)}) -> {s}')
print()
# #
# n = 899 # 720 + 179, [[720, 179], [260, 179, 157, 303], [167, 289, 153, 290], [289, 153, 457]]
# n = 6240
\ No newline at end of file
{
"python.testing.unittestArgs": [
"-v",
"-s",
".",
"-p",
"test_*.py"
],
"python.testing.pytestEnabled": false,
"python.testing.unittestEnabled": true
}
\ No newline at end of file
# Assignment E: Data Streams in Python &nbsp; (12 Pts)
### Challenges
- [Challenge 1:](#1-challenge-data-streams-in-python) Data Streams in Python
- [Challenge 2:](#2-challenge-map-function) *map()* function
- [Challenge 3:](#3-challenge-reduce-function) *reduce()* function
- [Challenge 4:](#4-challenge-sort-function) *sort()* function
- [Challenge 5:](#5-challenge-pipeline-for-product-codes) Pipeline for Product Codes
- [Challenge 6:](#6-challenge-run-unit-tests) Run Unit Tests
- [Challenge 7:](#7-challenge-sign-off) Sign-off
Points: [1, 2, 3, 3, 2, 0, 1]
&nbsp;
### 1.) Challenge: Data Streams in Python
Data streams are powerful abstractions for data-driven applications that also work in distributed environments. Big Data platforms often build on streams such as
[Spark Streams](https://spark.apache.org/docs/latest/streaming-programming-guide.html) or
[Kafka](https://kafka.apache.org/documentation/streams).
A data stream starts with a *source* (here just a list of names) followed by a pipeline of *chainable operations* performed on each data element passing through the stream. Results can be collected at the *terminus* of the stream.
Pull Python file [stream.py](stream.py).
```py
class Stream:
"""
Class of a data stream comprised of a sequence of stream operations:
"""
class __Stream_op:
"""
Inner class of one stream operation with chainable functions.
Instances comprise the stream pipeline.
"""
def slice(self, i1, i2=None, i3=1):
# function that returns new __Stream_op instance that slices stream
if i2 is None:
i2, i1 = i1, 0
#
return self.__new(self.__data[i1:i2:i3])
def filter(self, filter_func=lambda d : True) ...
# return new __Stream_op instance that passes only elements for
# which filter_func yields True
def map(self, map_func=lambda d : d) ...
# return new __Stream_op instance that passes elements resulting
# from map_func of corresponding elements in the inbound stream
def reduce(self, reduce_func, start=0) -> any: ...
# terminal function that returns single value compounded by reduce_func
def sort(self, comperator_func=lambda d1, d2 : True) ...
# return new __Stream_op instance that passes stream sorted by
# comperator_func
def cond(self, cond: bool, conditional): ...
# return same __Stream_op instance or apply conditional function
# on __Stream_op instance if condition yields True
def print(self) ...
# return same, unchanged __Stream_op instance and print as side effect
def count(self) -> int: ...
# terminal function that returns number of elements in terminal stream
def get(self) -> any: ...
# terminal function that returns final stream __data
```
Application of the stream can be demonstrated by the example of a stream of names. The stream is instantiated from the `names` list. The `source()` method returns the first `__Stream_op` instance onto which chainable stream methods can be attached.
The stream in the example filters names of length 4, prints those names and counts them. The *lambda* expression controls the filter process. Only names of length 4 pass to subsequent pipeline operations.
```py
names = ['Gonzalez', 'Gill', 'Hardin', 'Richardson', 'Buckner', 'Marquez',
'Howe', 'Ray', 'Navarro', 'Talley', 'Bernard', 'Gomez', 'Hamilton',
'Case', 'Petty', 'Lott', 'Casey', 'Hall', 'Pena', 'Witt', 'Joyner',
'Raymond', 'Crane', 'Hendricks', 'Vance', 'Cleveland', 'Duncan', 'Soto',
'Brock', 'Graham', 'Nielsen', 'Rutledge', 'Strong', 'Cox']
result = Stream(names).source() \
.filter(lambda n : len(n) == 4) \
.print() \
.count()
print(f'found {result} names with 4 letters.')
```
Output:
```c++
['Gill', 'Howe', 'Case', 'Lott', 'Hall', 'Pena', 'Witt', 'Soto']
found 8 names with 4 letters.
```
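The chaining mechanism itself can be studied in isolation. The class `Counter` below is a hypothetical toy example and unrelated to `stream.py`; it only shows that a method becomes chainable when it returns an object that offers the next method.

```python
class Counter:
    """Hypothetical class with chainable methods."""

    def __init__(self, value: int = 0):
        self.value = value

    def add(self, n: int):
        # return a new instance so the next call can attach to it
        return Counter(self.value + n)

    def double(self):
        return Counter(self.value * 2)

    def get(self) -> int:
        # terminal method: returns a plain value and ends the chain
        return self.value

result = Counter(1).add(2).double().get()
print(result)   # (1 + 2) * 2
```

Each call in the chain creates a new object linked to the data of its predecessor, which is the same principle the stream pipeline uses.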
**Questions:**
- How does method chaining work?
- What is required for chainable methods?
- How does a data pipeline get formed in the example?
- Draw a sketch of data objects and how they are linked from the example above.
(1 Pts)
&nbsp;
### 2.) Challenge: *map()* function
Complete the `map()` function in [stream.py](stream.py) so that the example produces
the desired result: Names are mapped to name lengths for the first 8 names.
Name lengths are then compounded to a single result.
```py
result = Stream(names).source() \
.slice(8) \
.print() \
.map(lambda n : len(n)) \
.print()
```
Output:
```c++
['Gonzalez', 'Gill', 'Hardin', 'Richardson', 'Buckner', 'Marquez', 'Howe', 'Ray']
[8, 4, 6, 10, 7, 7, 4, 3]
```
(2 Pts)
&nbsp;
### 3.) Challenge: *reduce()* function
Complete the `reduce()` function in [stream.py](stream.py) so that name lengths are
compounded (added one after another) to a single result.
```py
result = Stream(names).source() \
.slice(8) \
.print() \
.map(lambda n : len(n)) \
.print() \
.reduce(lambda x, y : x + y)
#
print(f'compound number of letters in names is: {result}.')
```
Output:
```c++
['Gonzalez', 'Gill', 'Hardin', 'Richardson', 'Buckner', 'Marquez', 'Howe', 'Ray']
[8, 4, 6, 10, 7, 7, 4, 3]
compound number of letters in names is: 49.
```
(2 Pts)
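The effect of mapping and reducing can be reproduced on plain lists. The sketch below uses a list comprehension and `functools.reduce` from the standard library; it illustrates the expected results only and does not show how the methods in `stream.py` must be written.

```python
from functools import reduce

names = ['Gonzalez', 'Gill', 'Hardin', 'Richardson', 'Buckner', 'Marquez', 'Howe', 'Ray']

# map: a new list with one result per input element
lengths = [len(n) for n in names]
print(lengths)

# reduce: fold all elements into one value, beginning with the start value 0
total = reduce(lambda x, y: x + y, lengths, 0)
print(f'compound number of letters in names is: {total}.')

# the start value determines the result type, here a string
joined = reduce(lambda x, y: str(x) + str(y), ['RAY', 'COX'], '')
print(joined)
```

The last call shows why the `start` argument matters: with `''` as start value the same reduce mechanism concatenates strings instead of adding numbers.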
3.1) Test your implementation to also work for the next example that produces
a single string of all n-letter names:
```py
n = 5
result = Stream(names).source() \
.filter(lambda name : len(name) == n) \
.print() \
.map(lambda n : n.upper()) \
.reduce(lambda x, y : str(x) + str(y), '')
#
print(f'compounded {n}-letter names: {result}.')
```
Output for n=3 and n=5:
```c++
['Ray', 'Cox']
compounded 3-letter names: RAYCOX.
['Gomez', 'Petty', 'Casey', 'Crane', 'Vance', 'Brock']
compounded 5-letter names: GOMEZPETTYCASEYCRANEVANCEBROCK.
```
(1 Pts)
&nbsp;
### 4.) Challenge: *sort()* function
Complete the `sort()` function in [stream.py](stream.py) so that the example produces
the desired result (use Python's built-in `sort()` or `sorted()` functions).
```py
Stream(names).source() \
.slice(8) \
.print('unsorted: ') \
.sort() \
.print(' sorted: ')
```
Output:
```c++
unsorted: ['Gonzalez', 'Gill', 'Hardin', 'Richardson', 'Buckner', 'Marquez', 'Howe', 'Ray']
sorted: ['Buckner', 'Gill', 'Gonzalez', 'Hardin', 'Howe', 'Marquez', 'Ray', 'Richardson']
```
(1 Pts)
4.1) Understand the sorted sequence below and define a `comperator` (expression that compares two elements (n1, n2) and yields `-1` if n1 should come before n2, `+1` if n1 must be after n2 or `0` if n1 is equal to n2):
```py
len_alpha_comperator = lambda ...
Stream(names).source() \
.sort(len_alpha_comperator) \
.print('sorted: ')
```
Output:
```c++
sorted: ['Cox', 'Ray', 'Case', 'Gill', 'Hall', 'Howe', 'Lott', 'Pena', 'Soto', 'Witt', 'Brock', 'Casey', 'Crane', 'Gomez', 'Petty', 'Vance', 'Duncan', 'Graham', 'Hardin', 'Joyner', 'Strong', 'Talley', 'Bernard', 'Buckner', 'Marquez', 'Navarro', 'Nielsen', 'Raymond', 'Gonzalez', 'Hamilton', 'Rutledge', 'Cleveland', 'Hendricks', 'Richardson']
```
(1 Pts)
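A comparator of the kind described in 4.1 can be tried out with Python's built-in `sorted()` and `functools.cmp_to_key`. The function name `len_alpha` is hypothetical, and the sketch assumes the order "shorter names first, equal lengths alphabetically", which is what the sample output suggests.

```python
from functools import cmp_to_key

def len_alpha(n1: str, n2: str) -> int:
    """Compare by length first, then alphabetically (hypothetical comparator)."""
    if len(n1) != len(n2):
        return -1 if len(n1) < len(n2) else 1
    return -1 if n1 < n2 else 1 if n1 > n2 else 0

names = ['Gonzalez', 'Gill', 'Hardin', 'Richardson', 'Buckner', 'Marquez', 'Howe', 'Ray']

# cmp_to_key converts a two-argument comparator into a key function
print(sorted(names, key=cmp_to_key(len_alpha)))
```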
4.2) Extend the pipeline so that it produces the following output:
```c++
sorted: [('Cox', 'Xoc', 3), ('Ray', 'Yar', 3), ('Brock', 'Kcorb', 5), ('Casey', 'Yesac', 5), ('Crane', 'Enarc', 5), ('Gomez', 'Zemog', 5), ('Petty', 'Yttep', 5), ('Vance', 'Ecnav', 5), ('Bernard', 'Dranreb', 7), ('Buckner', 'Renkcub', 7), ('Marquez', 'Zeuqram', 7), ('Navarro', 'Orravan', 7), ('Nielsen', 'Neslein', 7), ('Raymond', 'Dnomyar', 7), ('Cleveland', 'Dnalevelc', 9), ('Hendricks', 'Skcirdneh', 9)]
\\
16 odd-length names found.
```
(1 Pts)
&nbsp;
### 5.) Challenge: Pipeline for Product Codes
Build a pipeline that produces batches of five 6-digit numbers with prefix 'X'.
Numbers are in ascending order within each batch and end with a 1-digit checksum
after a dash. The checksum is the sum of all six digits of the random number modulo 10.
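The code format can be checked with a small helper. The function `product_code` is a hypothetical name used for illustration; in the pipeline, the same expression could be passed to `map()`.

```python
def product_code(number: int) -> str:
    """Format a number as product code 'X<digits>-<checksum>' (sketch)."""
    digits = str(number)
    checksum = sum(int(d) for d in digits) % 10     # digit sum modulo 10
    return f'X{digits}-{checksum}'

print(product_code(102042))     # digit sum 9
print(product_code(102180))     # digit sum 12, checksum 2
```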
```py
for i in range(1, 5):
# Stream of 5 random numbers from integer range, feel free to change
codes = Stream([random.randint(100000,999999) for j in range(5)]).source() \
... \
.get()
#
print(f'batch {i}: {codes}')
```
Output:
```c++
batch 1: ['X102042-9', 'X102180-2', 'X103228-6', 'X104680-9', 'X106782-4']
batch 2: ['X200064-2', 'X200732-4', 'X202090-3', 'X209056-2', 'X211464-8']
batch 3: ['X300186-8', 'X301416-5', 'X305962-5', 'X307938-0', 'X312524-7']
batch 4: ['X400216-3', 'X401436-8', 'X401682-1', 'X405256-2', 'X406376-6']
```
(1 Pts)
5.1) Alter the pipeline such that it produces only codes that consist of even digits:
```c++
batch 1: ['X226840-2', 'X284240-0', 'X448288-4', 'X804080-0', 'X888620-2']
batch 2: ['X220640-4', 'X248066-6', 'X648466-4', 'X680404-2', 'X882868-0']
batch 3: ['X262626-4', 'X608662-8', 'X626404-2', 'X662424-4', 'X846228-0']
batch 4: ['X224200-0', 'X282204-8', 'X448426-8', 'X600282-8', 'X802882-8']
```
(1 Pts)
&nbsp;
### 6.) Challenge: Run Unit Tests
Pull file
[test_stream.py](test_stream.py)
into the same directory. Run unit tests to confirm the correctness of your solution.
```sh
cd E_streams # change to directory where stream.py and test_stream.py are
test_url=https://gitlab.bht-berlin.de/sgraupner/ds_cs4bd_2324/-/raw/main/E_streams/test_stream.py
curl -O $(echo $test_url) # download file with unit tests from URL
python test_stream.py # run tests from test file
python -m unittest --verbose # run unit tests with discovery
```
Output:
```sh
Ran 12 tests in 0.001s
OK
Unit testing using test objects:
- test_filter_1()
- test_filter_11()
- test_filter_12()
- test_filter_13()
- test_map_2()
- test_map_21()
- test_reduce_3()
- test_reduce_31()
- test_sort_4()
- test_sort_41()
- test_sort_42()
- test_stream_generation()
---> 12/12 TESTS SUCCEEDED
```
&nbsp;
### 7.) Challenge: Sign-off
For sign-off, change into `E_streams` directory and copy commands into a terminal:
```sh
test_url=https://gitlab.bht-berlin.de/sgraupner/ds_cs4bd_2324/-/raw/main/E_streams/test_stream.py
curl $test_url | python # run tests from URL (use for sign-off)
```
Result:
```
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 6264 100 6264 0 0 38354 0 --:--:-- --:--:-- --:--:-- 38666
Unit testing using test objects:
<stdin>:153: DeprecationWarning: unittest.makeSuite() is deprecated and will be
removed in Python 3.13. Please use unittest.TestLoader.loadTestsFromTestCase() i
nstead.
----------------------------------------------------------------------
Ran 12 tests in 0.001s
OK
```
12 tests succeeded.
(1 Pts)
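The `DeprecationWarning` in the result names `unittest.TestLoader.loadTestsFromTestCase()` as replacement for `unittest.makeSuite()`. A minimal sketch of that replacement is shown below; the test class `ExampleTest` is hypothetical and stands in for the test classes of `test_stream.py`, whose actual code is not shown here.

```python
import io
import unittest

class ExampleTest(unittest.TestCase):
    """Hypothetical test case used only to build a suite."""

    def test_addition(self):
        self.assertEqual(1 + 1, 2)

# build the suite with the loader instead of the deprecated unittest.makeSuite()
suite = unittest.TestLoader().loadTestsFromTestCase(ExampleTest)

# run the suite; output is captured in a buffer to keep the example quiet
runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
result = runner.run(suite)
print(f'tests run: {result.testsRun}, successful: {result.wasSuccessful()}')
```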
"""
Special file __init__.py marks a directory as a Python package.
The file is executed once when the package or one of its modules is first imported.
//
Python unittest requires the presence of the file (it may be empty).
"""
# load setup module when executed in parent directory
try:
__import__('setup')
#
except ImportError:
pass
import random
"""
Class of a data stream comprised of a sequence of stream operations:
- slice(i1, i2, i3) # slice stream in analogy to python slicing
- filter(filter_func) # pass only elements for which filter_func yields True
- map(map_func) # pass stream where each element is mapped by map_func
- sort(comperator_func) # pass stream sorted by comperator_func
- cond(cond, cond_func) # pass stream or apply conditional function
- print() # pass unchanged stream and print as side effect
and with terminal functions:
- reduce(reduce_func, start) # compound stream to single value with reduce_func
- count() # return number of elements in terminal stream
- get() # return final stream data
"""
class Stream:
def __init__(self, _data=[]):
# constructor to initialize instance member variables
#
self.__streamSource = self.__new_op(_data)
class __Stream_op:
"""
Inner class of one stream operation with chainable functions.
Instances comprise the stream pipeline.
"""
def __init__(self, _new_op_func, _data):
self.__data = _data
self.__new = _new_op_func # __new_op() function injected from outer context
def slice(self, i1, i2=None, i3=1):
# function that returns new __Stream_op instance that slices stream
if i2 is None:
# flip i1, i2 for single arg, e.g. slice(0, 8), slice(8)
i2, i1 = i1, 0
#
# return new __Stream_op instance with sliced __data
return self.__new(self.__data[i1:i2:i3])
def filter(self, filter_func=lambda d : True):
# return new __Stream_op instance that passes only elements for
# which filter_func yields True
#
return self.__new([d for d in self.__data if filter_func(d)])
def map(self, map_func=lambda d : d):
# return new __Stream_op instance that passes elements resulting
# from map_func of corresponding elements in the inbound stream
#
# input data is list of current instance: self.__data
# mapping means a new list needs to be created with same number of
# elements, each obtained by applying map_func
# create new data for next __Stream_op instance from current instance
# data: self.__data
new_data = self.__data # <-- compute new data here
# create new __Stream_op instance with new stream data
new_stream_op_instance = self.__new(new_data)
return new_stream_op_instance
def reduce(self, reduce_func=lambda compound, d : compound + d, start=0) -> any:
# terminal function that returns single value compounded by reduce_func
#
compound = 0 # <-- compute compound result here
return compound
def sort(self, comperator_func=lambda n1, n2 : -1 if n1 < n2 else 1):
# return new __Stream_op instance that passes stream sorted by
# comperator_func
#
# create new data for next __Stream_op instance from current instance
# data: self.__data
new_data = self.__data # <-- compute new data here
# create new __Stream_op instance with new stream data
new_stream_op_instance = self.__new(new_data)
return new_stream_op_instance
def cond(self, cond: bool, conditional):
# return same __Stream_op instance or apply conditional function
# on __Stream_op instance if condition yields True
#
return conditional(self) if cond else self
def print(self, prefix=''):
# return same, unchanged __Stream_op instance and print as side effect
#
print(f'{prefix}{self.__data}')
return self
def count(self) -> int:
# terminal function that returns number of elements in terminal stream
#
return len(self.__data)
def get(self) -> any:
# terminal function that returns final stream __data
#
return self.__data
def source(self):
# return first __Stream_op instance of stream as source
#
return self.__streamSource
def __new_op(self, *argv):
# private method to create new __Stream_op instance
return Stream.__Stream_op(self.__new_op, *argv)
# attempt to load solution module (ignore)
try:
    _from, _import = 'stream_sol', 'Stream'
    # fetch Stream class from solution, if present
    Stream = getattr(__import__(_from, fromlist=[_import]), _import)
    #
except ImportError:
    pass
if __name__ == '__main__':

    run_choice = 3
    #
    run_choices = {
        1: "Challenge 1, Data streams in Python, run the first example",
        2: "Challenge 2, complete map() function",
        3: "Challenge 3, complete reduce() function",
        31: "Challenge 3.1, example RAYCOX",
        4: "Challenge 4, complete sort() function",
        41: "Challenge 4.1, len-alpha comperator",
        42: "Challenge 4.2, tuple output: ('Cox', 'Xoc', 3)",
        5: "Challenge 5, Pipeline for product codes",
        51: "Challenge 5.1, even digit codes"
    }
    names = ['Gonzalez', 'Gill', 'Hardin', 'Richardson', 'Buckner', 'Marquez',
        'Howe', 'Ray', 'Navarro', 'Talley', 'Bernard', 'Gomez', 'Hamilton',
        'Case', 'Petty', 'Lott', 'Casey', 'Hall', 'Pena', 'Witt', 'Joyner',
        'Raymond', 'Crane', 'Hendricks', 'Vance', 'Cleveland', 'Duncan', 'Soto',
        'Brock', 'Graham', 'Nielsen', 'Rutledge', 'Strong', 'Cox']

    if run_choice == 1:
        # Challenge 1, Data streams in Python, run the first example
        result = Stream(names).source() \
            .filter(lambda n : len(n) == 4) \
            .print() \
            .count()
        #
        print(f'found {result} names with 4 letters.')

    if run_choice == 2:
        # Challenge 2, complete map() function
        # to map names to name lengths for the first 8 names
        Stream(names).source() \
            .slice(8) \
            .print() \
            .map(lambda n : len(n)) \
            .print()

    if run_choice == 3:
        # Challenge 3, complete reduce() function
        # to compound all name lengths to a single result
        result = Stream(names).source() \
            .slice(8) \
            .print() \
            .map(lambda n : len(n)) \
            .print() \
            .reduce(lambda x, y : x + y)
        #
        print(f'compound number of letters in names is: {result}.')

    if run_choice == 31:
        # Challenge 3.1, example RAYCOX
        # compound single string of all n-letter names
        n = 3
        result = Stream(names).source() \
            .filter(lambda name : len(name) == n) \
            .print() \
            .map(lambda n : n.upper()) \
            .reduce(lambda x, y : str(x) + str(y), '')
        #
        print(f'compounded {n}-letter names: {result}.')

    if run_choice == 4:
        # Challenge 4, complete sort() function
        Stream(names).source() \
            .slice(8) \
            .print('unsorted: ') \
            .sort() \
            .print(' sorted: ')

    alpha_comperator = lambda n1, n2 : -1 if n1 < n2 else 1
    len_alpha_comperator = lambda n1, n2 : -1 if len(n1) < len(n2) else 1 if len(n1) > len(n2) else alpha_comperator(n1, n2)
    #
    if run_choice == 41:
        # Challenge 4.1, len-alpha comperator
        Stream(names).source() \
            .sort(len_alpha_comperator) \
            .print('sorted: ')

    if run_choice == 42:
        # Challenge 4.2, tuple output: ('Cox', 'Xoc', 3)
        result = Stream(names).source() \
            .sort(len_alpha_comperator) \
            .map(lambda n : (n, n[::-1].capitalize(), len(n))) \
            .filter(lambda n1 : n1[2] % 2 == 1) \
            .print('sorted: ') \
            .count()
        #
        print(f'\\\\\n{result} odd-length names found.')

    # rand_numbers = [random.randint(100000,999999) for i in range(30)]
    # print(f'random numbers: {rand_numbers}')
    #
    if run_choice == 5 or run_choice == 51:
        # Challenge 5, Pipeline for product codes
        # Challenge 5.1, even digit codes
        #
        for i in range(1, 5):
            # Stream of 5 random numbers from integer range, feel free to change
            codes = Stream([random.randint(100000,999999) for j in range(1000)]).source() \
                .filter(lambda n : n % 2 == 0) \
                .cond( run_choice == 51, \
                    # use only numbers with even digits, test by split up number in sequence of digits
                    lambda op : op.filter(lambda n : len(set(map(int, str(n))).intersection([1, 3, 5, 7, 9])) == 0) \
                ) \
                .slice(5) \
                .sort() \
                .map(lambda n : f'X{n}-{sum(list(map(int, str(n)))) % 10}') \
                .get()
            #
            print(f'batch {i}: {codes}')
import unittest
from stream import Stream


class Stream_test(unittest.TestCase):
    """
    Test class.
    """
    list_1 = [4, 12, 3, 8, 17, 12, 1, 8, 7]
    list_1_str = [str(d) for d in list_1]
    names = ['Gonzalez', 'Gill', 'Hardin', 'Richardson', 'Buckner', 'Marquez',
        'Howe', 'Ray', 'Navarro', 'Talley', 'Bernard', 'Gomez', 'Hamilton',
        'Case', 'Petty', 'Lott', 'Casey', 'Hall', 'Pena', 'Witt', 'Joyner',
        'Raymond', 'Crane', 'Hendricks', 'Vance', 'Cleveland', 'Duncan', 'Soto',
        'Brock', 'Graham', 'Nielsen', 'Rutledge', 'Strong', 'Cox']

    # tests for stream generation function
    def test_stream_generation(self):
        #
        result = Stream(self.list_1).source() \
            .get()
        self.assertEqual(self.list_1, result)

    # tests for filter() function
    def test_filter_1(self):
        #
        # test Challenge 1
        result = Stream(self.list_1).source() \
            .filter(lambda n : n % 2 == 1) \
            .get()
        self.assertEqual([3, 17, 1, 7], result)

    def test_filter_11(self):
        result = Stream(self.list_1).source() \
            .filter(lambda d : False) \
            .get()
        self.assertEqual([], result)

    def test_filter_12(self):
        result = Stream(self.list_1).source() \
            .filter(lambda d : True) \
            .get()
        self.assertEqual(self.list_1, result)

    def test_filter_13(self):
        result = Stream(self.names).source() \
            .filter(lambda n : len(n) == 4) \
            .get()
        self.assertEqual(['Gill', 'Howe', 'Case', 'Lott', 'Hall', 'Pena', 'Witt', 'Soto'], result)

    # tests for map() function
    def test_map_2(self):
        #
        # test Challenge 2
        result = Stream(self.names).source() \
            .slice(8) \
            .map(lambda n : len(n)) \
            .get()
        self.assertEqual([8, 4, 6, 10, 7, 7, 4, 3], result)

    def test_map_21(self):
        result = Stream(self.names).source() \
            .filter(lambda n : len(n) == 3) \
            .map(lambda n : (n, len(n))) \
            .get()
        self.assertEqual([('Ray', 3), ('Cox', 3)], result)

    # tests for reduce() function
    def test_reduce_3(self):
        #
        # test Challenge 3
        result = Stream(self.names).source() \
            .slice(8) \
            .map(lambda n : len(n)) \
            .reduce(lambda x, y : x + y)
        self.assertEqual(49, result)

    def test_reduce_31(self):
        # test Challenge 3.1
        n = 3
        result = Stream(self.names).source() \
            .filter(lambda name : len(name) == n) \
            .map(lambda n : n.upper()) \
            .reduce(lambda x, y : str(x) + str(y), '')
        self.assertEqual('RAYCOX', result)
        #
        n = 5
        result = Stream(self.names).source() \
            .filter(lambda name : len(name) == n) \
            .map(lambda n : n.upper()) \
            .reduce(lambda x, y : str(x) + str(y), '')
        self.assertEqual('GOMEZPETTYCASEYCRANEVANCEBROCK', result)

    # tests for sort() function
    def test_sort_4(self):
        # test Challenge 4
        result = Stream(self.names).source() \
            .slice(8) \
            .sort() \
            .get()
        expected = ['Buckner', 'Gill', 'Gonzalez', 'Hardin', 'Howe', 'Marquez', 'Ray', 'Richardson']
        self.assertEqual(expected, result)

    def alpha_comperator(self, n1, n2):
        return -1 if n1 < n2 else 1

    def len_alpha_comperator(self, n1, n2):
        return -1 if len(n1) < len(n2) else 1 if len(n1) > len(n2) else self.alpha_comperator(n1, n2)

    def test_sort_41(self):
        # test Challenge 4.1
        result = Stream(self.names).source() \
            .sort(self.len_alpha_comperator) \
            .get()
        #
        expected = ['Cox', 'Ray', 'Case', 'Gill', 'Hall', 'Howe', 'Lott', 'Pena', 'Soto', 'Witt',
            'Brock', 'Casey', 'Crane', 'Gomez', 'Petty', 'Vance', 'Duncan', 'Graham', 'Hardin',
            'Joyner', 'Strong', 'Talley', 'Bernard', 'Buckner', 'Marquez', 'Navarro', 'Nielsen',
            'Raymond', 'Gonzalez', 'Hamilton', 'Rutledge', 'Cleveland', 'Hendricks', 'Richardson'
        ]
        self.assertEqual(expected, result)

    def test_sort_42(self):
        # test Challenge 4.2
        result = Stream(self.names).source() \
            .sort(self.len_alpha_comperator) \
            .map(lambda n : (n, n[::-1].capitalize(), len(n))) \
            .filter(lambda n1 : n1[2] % 2 == 1) \
            .get()
        #
        expected = [('Cox', 'Xoc', 3), ('Ray', 'Yar', 3), ('Brock', 'Kcorb', 5), ('Casey', 'Yesac', 5),
            ('Crane', 'Enarc', 5), ('Gomez', 'Zemog', 5), ('Petty', 'Yttep', 5), ('Vance', 'Ecnav', 5),
            ('Bernard', 'Dranreb', 7), ('Buckner', 'Renkcub', 7), ('Marquez', 'Zeuqram', 7),
            ('Navarro', 'Orravan', 7), ('Nielsen', 'Neslein', 7), ('Raymond', 'Dnomyar', 7),
            ('Cleveland', 'Dnalevelc', 9), ('Hendricks', 'Skcirdneh', 9)
        ]
        self.assertEqual(expected, result)
        #
        result = Stream(self.names).source() \
            .sort(self.len_alpha_comperator) \
            .map(lambda n : (n, n[::-1].capitalize(), len(n))) \
            .filter(lambda n1 : n1[2] % 2 == 1) \
            .count()
        self.assertEqual(16, result)

    # report results,
    # see https://stackoverflow.com/questions/28500267/python-unittest-count-tests
    # currentResult = None
    # @classmethod
    # def setResult(cls, amount, errors, failures, skipped):
    #     cls.amount, cls.errors, cls.failures, cls.skipped = \
    #         amount, errors, failures, skipped
    # def tearDown(self):
    #     amount = self.currentResult.testsRun
    #     errors = self.currentResult.errors
    #     failures = self.currentResult.failures
    #     skipped = self.currentResult.skipped
    #     self.setResult(amount, errors, failures, skipped)
    # @classmethod
    # def tearDownClass(cls):
    #     print("\ntests run: " + str(cls.amount))
    #     print("errors: " + str(len(cls.errors)))
    #     print("failures: " + str(len(cls.failures)))
    #     print("success: " + str(cls.amount - len(cls.errors) - len(cls.failures)))
    #     print("skipped: " + str(len(cls.skipped)))
    # def run(self, result=None):
    #     self.currentResult = result # remember result for use in tearDown
    #     unittest.TestCase.run(self, result) # call superclass run method


if __name__ == '__main__':
    result = unittest.main()
# Assignment F: Graph Data &nbsp; (10 Pts)
### Challenges
- [Challenge 1:](#1-challenge-understanding-graph-data) Understanding Graph Data
- [Challenge 2:](#2-challenge-representing-graph-data-in-python) Representing Graph Data in Python
- [Challenge 3:](#3-challenge-implementing-the-graph-in-python) Implementing the Graph in Python
- [Challenge 4:](#4-challenge-implementing-dijkstras-shortest-path-algorithm) Implementing Dijkstra's Shortest Path Algorithm
- [Challenge 5:](#5-challenge-run-for-another-graph) Run for Another Graph
Points: [1, 1, 2, 4, 2]
&nbsp;
### 1.) Challenge: Understanding Graph Data
A *[Graph](https://en.wikipedia.org/wiki/Graph_theory)*
is a set of nodes (vertices) and edges connecting nodes G = { n ∈ N, e ∈ E }.
A *weighted Graph* has a *weight* (number) associated with each edge.
A *[Path](https://en.wikipedia.org/wiki/Path_(graph_theory))*
is a subset of edges that connects a subset of nodes.
We consider connected graphs where all nodes can be reached from any other
node by at least one path (no disconnected subgraphs).
Graphs may have cycles (paths that lead to nodes visited before) or
paths may join at nodes that are part of other paths.
Traversal is the process of visiting each node of a graph exactly once.
Multiple visits of graph nodes by cycles or joins must be detected by
marking visited nodes (which is not preferred since it alters the data set)
or by keeping a separate record of visits.
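
The second option, a separate record of visits, can be sketched as follows. This is a minimal illustration, not part of the assignment: the function name `traverse`, the breadth-first visiting order and the small example graph are assumptions made for the example.

```python
from collections import deque


def traverse(graph, start):
    """Breadth-first traversal that visits each reachable node exactly once."""
    visited = {start}       # separate record of visits, the graph itself is not altered
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:     # detects cycles and joins
                visited.add(neighbor)
                queue.append(neighbor)
    return order


# hypothetical graph with a cycle A-B-C-A and paths that join at node D
example = {
    'A': ['B', 'C'],
    'B': ['A', 'C', 'D'],
    'C': ['A', 'B', 'D'],
    'D': ['B', 'C'],
}
print(traverse(example, 'A'))    # ['A', 'B', 'C', 'D']
```

Without the `visited` set, the cycle A-B-C-A would keep the loop running forever.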
Write two properties that distinguish graphs from trees.
(1 Pt)
&nbsp;
### 2.) Challenge: Representing Graph Data in Python
Python has no built-in data type that supports graph data.
Separate packages may be used such as
[NetworkX](https://networkx.org/).
In this assignment, we focus on basic Python data structures.
1. How can Graphs be represented in general?
1. How can these be implemented using Python base data structures?
1. Which data structure would be efficient given the fact that in the
example below the graph is constant and only traversal operations
are performed?
(1 Pt)
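
As a starting point for the questions above, the sketch below shows one common option among several (others are an adjacency matrix or a plain edge list): a dictionary of dictionaries built from an edge list. The helper name `to_adjacency` and the three-node graph are hypothetical, and the edges are assumed to be undirected.

```python
# hypothetical weighted, undirected graph given as an edge list: (node, node, weight)
edges = [('A', 'B', 2), ('B', 'C', 5), ('A', 'C', 9)]


def to_adjacency(edge_list):
    """Build a dict-of-dicts adjacency structure: graph[n1][n2] == weight."""
    adjacency = {}
    for n1, n2, weight in edge_list:
        # store each edge in both directions since the graph is assumed undirected
        adjacency.setdefault(n1, {})[n2] = weight
        adjacency.setdefault(n2, {})[n1] = weight
    return adjacency


graph = to_adjacency(edges)
print(graph['A'])           # {'B': 2, 'C': 9}
print(graph['C']['B'])      # 5
```

The dictionary lookup gives direct access to all neighbors of a node, which suits a graph that is built once and then only traversed.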
&nbsp;
### 3.) Challenge: Implementing the Graph in Python
Watch the video and understand how
[Dijkstra's Shortest Path Algorithm](https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm)
(1956) works and which information it needs.
*Edsger W. Dijkstra* (1930-2003,
[bio](https://en.wikipedia.org/wiki/Edsger_W._Dijkstra))
was a Dutch computer scientist, programmer and software engineer.
He was a professor of Computer Science at the University of Texas at Austin
and has received numerous awards, including the
[Turing Award](https://en.wikipedia.org/wiki/Turing_Award)
in 1972.
<!--
[video (FelixTechTips)](https://youtu.be/bZkzH5x0SKU?si=n8Z2ZIfbB73_v1TE)
<img src="../markup/img/graph_2a.jpg" alt="drawing" width="640"/>
-->
[Video (Mike Pound, Computerphile)](https://youtu.be/GazC3A4OQTE?si=ZuBEcWaBzuKmPMqA)
<img src="../markup/img/graph_1.jpg" alt="drawing" width="640"/>
Node `S` forms the start of the algorithm, node `E` is the destination.
Draw a sketch of the data structures needed to represent the graph with
nodes, edges and weights and also the data needed for the algorithm.
Create a Python file `shortest_path.py` with
- declarations of data structures you may need for the graph and
information for the algorithm and
- data to represent the graph in the video with nodes: {A ... K, S} and
the shown edges with weights.
(2 Pts)
&nbsp;
### 4.) Challenge: Implementing Dijkstra's Shortest Path Algorithm
Implement Dijkstra's Algorithm.
Output the shortest path as a sequence of nodes, followed by an analysis and
the shortest distance.
```
shortest path: S -> B -> H -> G -> E
analysis:
S->B(2)
B->H(1)
H->G(2)
G->E(2)
shortest distance is: 7
```
(4 Pts)
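
For orientation, a minimal sketch of the algorithm on a small, made-up graph follows. It is one possible approach using a priority queue from Python's `heapq` module, not the reference solution; the function name `shortest_path`, the dict-of-dicts graph layout and the graph `toy` are assumptions for this example.

```python
import heapq


def shortest_path(graph, start, end):
    """Dijkstra's algorithm on a dict-of-dicts graph: graph[n1][n2] == weight."""
    dist = {start: 0}       # best known distance from start to each node
    prev = {}               # predecessor of each node on its best known path
    done = set()            # nodes whose shortest distance is final
    queue = [(0, start)]    # priority queue of (distance, node)
    while queue:
        d, node = heapq.heappop(queue)
        if node in done:
            continue        # stale queue entry, a shorter path was already processed
        done.add(node)
        if node == end:
            break
        for neighbor, weight in graph[node].items():
            nd = d + weight
            if nd < dist.get(neighbor, float('inf')):
                dist[neighbor] = nd
                prev[neighbor] = node
                heapq.heappush(queue, (nd, neighbor))
    if end not in dist:
        return None, []     # end is not reachable from start
    # walk predecessors back from end to start to recover the path
    path = [end]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[end], path


# hypothetical graph, not the graph from the video
toy = {
    'S': {'A': 4, 'B': 1},
    'A': {'S': 4, 'B': 2, 'E': 5},
    'B': {'S': 1, 'A': 2, 'E': 8},
    'E': {'A': 5, 'B': 8},
}
distance, path = shortest_path(toy, 'S', 'E')
print('shortest path:', ' -> '.join(path))     # shortest path: S -> B -> A -> E
print('shortest distance is:', distance)       # shortest distance is: 8
```

When several paths share the same shortest distance, the path returned depends on the order in which edges are processed.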
&nbsp;
### 5.) Challenge: Run for Another Graph
Run your algorithm for another graph G: {A ... F} with weights:
```
G: {A, B, C, D, E, F}, start: A, end: C
Weights:
AB(2), BE(6), EC(9), AD(8), BD(5),
DE(3), DF(2), EF(1), FC(3)
```
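
The edge notation above can be converted into a Python data structure with a few lines. The sketch below assumes that edges are undirected and that node names are single uppercase letters; the helper name `parse_edges` is hypothetical.

```python
import re


def parse_edges(spec):
    """Parse a string such as 'AB(2), BE(6)' into graph[n1][n2] == weight."""
    graph = {}
    for n1, n2, weight in re.findall(r'([A-Z])([A-Z])\((\d+)\)', spec):
        # assumption: edges are undirected, so store both directions
        graph.setdefault(n1, {})[n2] = int(weight)
        graph.setdefault(n2, {})[n1] = int(weight)
    return graph


spec = 'AB(2), BE(6), EC(9), AD(8), BD(5), DE(3), DF(2), EF(1), FC(3)'
graph = parse_edges(spec)
print(sorted(graph))    # ['A', 'B', 'C', 'D', 'E', 'F']
print(graph['D'])       # {'A': 8, 'B': 5, 'E': 3, 'F': 2}
```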
Output the result:
```
shortest path: A -> B -> D -> F -> C
analysis:
A->B(2)
B->D(5)
D->F(2)
F->C(3)
shortest distance is: 12
```
(2 Pts)
# Assignment G: Docker &nbsp; (18 Pts)
This assignment will setup Docker. If you already have it, simply run challenges and answer questions.
Docker is a popular software packaging, distribution and execution infrastructure using containers.
- Docker containers run natively on Linux only (originally built on LXC). Mac and Windows use adapter technologies.
- Windows uses an internal Linux VM to run Docker engine ( *dockerd* ).
- Client tools (CLI, GUI, e.g.
[Docker Desktop](https://docs.docker.com/desktop/install/windows-install/)
for Windows) are used to manage and execute containers.
Docker builds on Linux technologies:
- stackable layers of filesystem images that each contain only a diff to an underlying image.
- tools to build, manage and distribute layered images ("ship containers").
- Linux LXC technology to “execute containers” as groups of isolated processes on a Linux system (create/run a new container, start/stop/join container).
Solomon Hykes, PyCon 2013, Santa Clara CA: *"The Future of Linux Containers"* ([watch](https://www.youtube.com/watch?v=9xciauwbsuo), 5:21min).
### Challenges
- [Challenge 1:](#1-challenge-docker-setup-and-cli) Docker Setup and CLI
- [Challenge 2:](#2-challenge-run-hello-world-container) Run *hello-world* Container
- [Challenge 3:](#3-challenge-run-minimal-alpine-python-container) Run minimal (Alpine) Python Container
- [Challenge 4:](#4-challenge-configure-alpine-container-for-ssh) Configure Alpine Container for *ssh*
- [Challenge 5:](#5-challenge-build-alpine-python-container-with-ssh-access) Build Alpine-Python Container with *ssh*-Access
- [Challenge 6:](#6-challenge-setup-ide-to-develop-code-in-alpine-python-container) Setup IDE to develop Code in Alpine-Python Container
- [Challenge 7:](#7-challenge-run-jupyter-in-docker-container) Run *Jupyter* in Docker Container
Points: [2, 2, 2, 4, 4, 2, 2]
&nbsp;
### 1.) Challenge: Docker Setup and CLI
[Docker Desktop](https://docs.docker.com/desktop)
bundles all Docker components necessary to run Docker on your
system (Windows, Mac, Linux). It comes with a GUI that makes using Docker
easier and is recommended for beginners.
Components can also be installed individually (e.g. "Docker Engine"), but this
may involve installation of dependencies such as the WSL virtual machine on Windows.
Docker CLI
Docker CLI is the Docker command-line interface that is needed to run docker
commands in a terminal.
After setting up Docker Desktop, open a terminal and type commands:
```sh
> docker --version
Docker version 20.10.17, build 100c701
> docker --help
...
> docker ps ; dockerd is not running
error during connect: This error may indicate that the docker daemon is not running.
> docker ps ; dockerd is now running, no containers yet
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
```
If you can't run the `docker` command, the client-side **docker-CLI** (Command-Line-Interface) may not be installed or not on the PATH-variable. If `docker ps` says: "can't connect", the **Docker engine** (server-side: *dockerd* ) is not running and must be started.
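
The two checks described above can also be scripted. The following is a small sketch using only the Python standard library; the function names `docker_cli_path` and `docker_engine_running` are made up for this example, and it assumes that `docker ps` returns a non-zero exit code when the engine cannot be reached.

```python
import shutil
import subprocess


def docker_cli_path():
    """Return the path of the docker CLI, or None if it is not on the PATH."""
    return shutil.which('docker')


def docker_engine_running(timeout=10):
    """Return True if 'docker ps' succeeds, i.e. the engine (dockerd) answers."""
    if docker_cli_path() is None:
        return False
    try:
        result = subprocess.run(['docker', 'ps'], capture_output=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        return False
    # assumption: a non-zero exit code means the engine could not be reached
    return result.returncode == 0


print('docker CLI:', docker_cli_path() or 'not found on PATH')
print('engine running:', docker_engine_running())
```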
(2 Pts)
&nbsp;
### 2.) Challenge: Run *hello-world* Container
Run the *hello-world* container from Docker-Hub: [hello-world](https://hub.docker.com/_/hello-world):
```sh
> docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
2db29710123e: Pull complete
Digest: sha256:62af9efd515a25f84961b70f973a798d2eca956b1b2b026d0a4a63a3b0b6a3f2
Status: Downloaded newer image for hello-world:latest
Hello from Docker!
This message shows that your installation appears to be working correctly.
```
Show the container image loaded on your system:
```sh
> docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
hello-world latest feb5d9fea6a5 12 months ago 13.3kB
```
Show that the container is still present after the end of execution:
```sh
> docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
da16000022e0 hello-world "/hello" 6 min ago Exited(0) magical_aryabhata
```
Re-start the container with an attached (-a) *stdout* terminal.
Refer to the container either by its ID ( *da16000022e0* ) or by its
generated NAME ( *magical_aryabhata* ).
```sh
> docker start da16000022e0 -a or: docker start magical_aryabhata -a
Hello from Docker!
This message shows that your installation appears to be working correctly.
```
Re-run will create a new container and execute it. `docker ps -a ` will then
show two containers created from the same image.
```sh
> docker run hello-world
Hello from Docker!
This message shows that your installation appears to be working correctly.
> docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
da16000022e0 hello-world "/hello" 6 min ago Exited(0) magical_aryabhata
40e605d9b027 hello-world "/hello" 4 sec ago Exited(0) pedantic_rubin
```
"Run" always creates new containers while "start" restarts existing containers.
(2 Pts)
&nbsp;
### 3.) Challenge: Run minimal (Alpine) Python Container
[Alpine](https://www.alpinelinux.org) is a minimal base image that has become
popular for building lean containers (a few MB as opposed to hundreds of MB or several GB).
Being mindful of resources is important for container deployments in cloud
environments where large numbers of containers are deployed and resource use
is billed.
Pull the latest Alpine image from Docker-Hub (no container is created with just
pulling the image). Mind image sizes: hello-world (13.3kB), alpine (5.54MB).
```sh
> docker pull alpine:latest
latest: Pulling from library/alpine
Digest: sha256:bc41182d7ef5ffc53a40b044e725193bc10142a1243f395ee852a8d9730fc2ad
Status: Image is up to date for alpine:latest
docker.io/library/alpine:latest
> docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
hello-world latest feb5d9fea6a5 12 months ago 13.3kB
alpine latest 9c6f07244728 8 weeks ago 5.54MB
```
Create and run an Alpine container executing an interactive shell `/bin/sh` attached to the terminal ( `-it` ). It launches the shell that runs commands inside the Alpine
container.
```sh
> docker run -it alpine:latest /bin/sh
# ls -la
total 64
drwxr-xr-x 1 root root 4096 Oct 5 18:32 .
drwxr-xr-x 1 root root 4096 Oct 5 18:32 ..
-rwxr-xr-x 1 root root 0 Oct 5 18:32 .dockerenv
drwxr-xr-x 2 root root 4096 Aug 9 08:47 bin
drwxr-xr-x 5 root root 360 Oct 5 18:32 dev
drwxr-xr-x 1 root root 4096 Oct 5 18:32 etc
drwxr-xr-x 2 root root 4096 Aug 9 08:47 home
drwxr-xr-x 7 root root 4096 Aug 9 08:47 lib
drwxr-xr-x 5 root root 4096 Aug 9 08:47 media
drwxr-xr-x 2 root root 4096 Aug 9 08:47 mnt
drwxr-xr-x 2 root root 4096 Aug 9 08:47 opt
dr-xr-xr-x 179 root root 0 Oct 5 18:32 proc
drwx------ 1 root root 4096 Oct 5 18:36 root
drwxr-xr-x 2 root root 4096 Aug 9 08:47 run
drwxr-xr-x 2 root root 4096 Aug 9 08:47 sbin
drwxr-xr-x 2 root root 4096 Aug 9 08:47 srv
dr-xr-xr-x 13 root root 0 Oct 5 18:32 sys
drwxrwxrwt 2 root root 4096 Aug 9 08:47 tmp
drwxr-xr-x 7 root root 4096 Aug 9 08:47 usr
drwxr-xr-x 12 root root 4096 Aug 9 08:47 var
# whoami
root
# uname -a
Linux aab69035680f 5.10.124-linuxkit #1 SMP Thu Jun 30 08:19:10 UTC 2022 x86_64
# exit
```
Commands after the `#` prompt (*root* prompt) are executed by the `/bin/sh` shell
inside the container.
`# exit` ends the shell process and returns to the surrounding shell. The container
will go into a dormant (inactive) state.
```sh
> docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
aab69035680f alpine:latest "/bin/sh" 9 min ago Exited boring_ramanujan
```
The container can be restarted with any number of `/bin/sh` shell processes.
Containers are executed by **process groups** - so-called
[cgroups](https://en.wikipedia.org/wiki/Cgroups) used by
[LXC](https://wiki.gentoo.org/wiki/LXC) -
that share the same environment (filesystem view, ports, etc.), but are isolated
from process groups of other containers.
Start a shell process in the dormant Alpine-container to re-activate.
The start command will execute the default command that is built into the container
(see the COMMAND column: `"/bin/sh"`). The option `-ai` attaches *stdout* and *stdin*
of the terminal to the container.
Write *"Hello, container"* into a file: `/tmp/hello.txt`. Don't leave the shell.
```sh
> docker start aab69035680f -ai
# echo "Hello, container!" > /tmp/hello.txt
# cat /tmp/hello.txt
Hello, container!
#
```
Start another shell in another terminal for the container. Since it refers to the same
container, both shell processes share the same filesystem.
The second shell can therefore see the file created by the first and append another
line, which again will be seen by the first shell.
```sh
> docker start aab69035680f -ai
# cat /tmp/hello.txt
Hello, container!
# echo "How are you?" >> /tmp/hello.txt
```
First terminal:
```sh
# cat /tmp/hello.txt
Hello, container!
How are you?
#
```
In order to perform other commands than the default command in a running container,
use `docker exec`.
Execute command: `cat /tmp/hello.txt` in a third terminal:
```sh
docker exec aab69035680f cat /tmp/hello.txt
Hello, container!
How are you?
```
The execution creates a new process that runs in the container seeing its filesystem
and other resources.
Explain the next command:
- What is the result?
- How many processes are involved?
- Draw a sketch with the container, processes and their stdin/-out connections.
```sh
echo "echo That\'s great to hear! >> /tmp/hello.txt" | \
docker exec -i aab69035680f /bin/sh
```
When all processes have exited, the container will return to the dormant state.
It will preserve the created file.
(2 Pts)
&nbsp;
### 4.) Challenge: Configure Alpine Container for *ssh*
Create a new Alpine container with name `alpine-ssh` and configure it for
[ssh](https://en.wikipedia.org/wiki/Secure_Shell) access.
```sh
docker run --name alpine-ssh -p 22:22 -it alpine:latest
```
Instructions for installation and configuration can be found here:
["How to install OpenSSH server on Alpine Linux"](https://www.cyberciti.biz/faq/how-to-install-openssh-server-on-alpine-linux-including-docker) or here:
["Setting up a SSH server"](https://wiki.alpinelinux.org/wiki/Setting_up_a_SSH_server).
Add a local user *larry* with *sudo*-rights, install *sshd* listening on the
default port 22.
Write down commands that you used for setup and configuration to enable the
container to run *sshd*.
Verify that *sshd* is running in the container:
```sh
# ps -a
PID USER TIME COMMAND
1 root 0:00 /bin/sh
254 root 0:00 sshd: /usr/sbin/sshd [listener] 0 of 10-100 startups
261 root 0:00 ps -a
```
Show that *ssh* is working by logging in as *larry* from another terminal:
```sh
> ssh larry@localhost
Welcome to Alpine!
The Alpine Wiki contains a large amount of how-to guides and general
information about administrating Alpine systems.
See <http://wiki.alpinelinux.org/>.
You can setup the system with the command: setup-alpine
You may change this message by editing /etc/motd.
54486c62d745:~$ whoami
larry
54486c62d745:~$ ls -la
total 32
drwxr-sr-x 1 larry larry 4096 Oct 2 21:34 .
drwxr-xr-x 1 root root 4096 Oct 2 20:40 ..
-rw------- 1 larry larry 602 Oct 5 18:53 .ash_history
54486c62d745:~$ uname -a
Linux 54486c62d745 5.10.124-linuxkit #1 SMP Thu Jun 30 08:19:10 UTC 2022 x86_64 Linux
54486c62d745:~$
```
(4 Pts)
&nbsp;
### 5.) Challenge: Build Alpine-Python Container with *ssh*-Access
[`python:latest`](https://hub.docker.com/_/python/tags) official image is 340MB while [`python:3.9.0-alpine`](https://hub.docker.com/_/python/tags?name=3.9-alpine&page=1) is ~18MB. The alpine-version builds on minimal Alpine Linux while the official version builds on Debian. "Minimal" means that the commands and tools available inside the container are restricted. Only basic tools are available. Required additional tools need to be installed into the container.
Build a new ```alpine-python-sshd``` container based on the ```python:3.9.0-alpine``` image that includes Python 3.9.0 and ssh-access so that your IDE can remotely connect to the container and run/debug Python code inside the container, which is the subject of the next challenge.
Copy file [print_sys.py](https://github.com/sgra64/cs4bigdata/blob/main/A_setup_python/print_sys.py) from Assignment A into larry's ```$HOME``` directory and execute.
```sh
> ssh larry@localhost
Welcome to Alpine!
54486c62d745:~$ python print_sys.py
Python impl: CPython
Python version: #1 SMP Thu Jun 30 08:19:10 UTC 2022
Python machine: x86_64
Python system: Linux
Python version: 3.9.0
54486c62d745:~$
```
(4 Pts)
&nbsp;
### 6.) Challenge: Setup IDE to develop Code in Alpine-Python Container
Setup your IDE to run/debug Python code inside the `alpine-python-sshd` container. In Visual Studio Code (with extensions for "Remote Development", "Docker" and "Dev Containers"), go to the Docker side-Tab, Right-click the running container and "Attach Visual Studio Code". This opens a new VSCode Window with a view from inside the container with file [print_sys.py](https://github.com/sgra64/cs4bigdata/blob/main/A_setup_python/print_sys.py) from the previous challenge.
Run this file in the IDE connected to the container. Output will show it running under Linux, Python 3.9.0 in `/home/larry`.
<!-- ![Remote Code](Setup_img01.png) -->
<img src="../markup/img/G_docker_img01.png" alt="drawing" width="640"/>
(2 Pts)
&nbsp;
### 7.) Challenge: Run *Jupyter* in Docker Container
Setup a Jupyter-server from the [Jupyter Docker Stack](https://jupyter-docker-stacks.readthedocs.io/en/latest/index.html). Jupyter Docker Stacks are a set of ready-to-run Docker images containing Jupyter applications and interactive computing tools.
[Selecting an Image](https://jupyter-docker-stacks.readthedocs.io/en/latest/using/selecting.html) determines the features preinstalled for Jupyter. Configurations exist for *all-spark-notebook* building on *pyspark-notebook* building on *scipy-notebook*, which builds on a *minimal-* and *base-notebook*, which builds on an *Ubuntu LTS* distribution. Other variations exist for *tensorflow-*, *datascience-*, or *R-notebooks*.
![Remote Code](https://jupyter-docker-stacks.readthedocs.io/en/latest/_images/inherit.svg)
Pull the image for the *minimal-notebook* (415 MB, [tags](https://hub.docker.com/r/jupyter/minimal-notebook/tags/) ) and start it.
```sh
docker pull jupyter/minimal-notebook:latest
docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
jupyter/minimal-notebook latest 33f2fa3eb079 18h ago 1.39GB
```
Create the container with Jupyter's default port 8888 exposed to the host environment.
Watch for the URL with the token in the output log.
```sh
docker run --name jupyter-minimal -p 8888:8888 jupyter/minimal-notebook
Entered start.sh with args: jupyter lab
Executing the command: jupyter lab
[I 2022-10-10 21:53:22.855 ServerApp] jupyterlab | extension was successfully linked.
...
To access the server, open this file in a browser:
http://127.0.0.1:8888/lab?token=6037ff448a79463b97e3c29af712b9395dd8548b71d77769
```
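
If the log output is captured by a script, the token can be picked out of the URL with Python's standard `urllib.parse` module. This is only a sketch: the function name `extract_token` is hypothetical, and the sample line reuses the URL from the log excerpt above with the wrapped token joined into one line.

```python
from urllib.parse import urlparse, parse_qs


def extract_token(log_line):
    """Return the value of the 'token' query parameter of the first URL, or None."""
    for word in log_line.split():
        if word.startswith('http'):
            query = parse_qs(urlparse(word).query)
            if 'token' in query:
                return query['token'][0]
    return None


line = '    http://127.0.0.1:8888/lab?token=6037ff448a79463b97e3c29af712b9395dd8548b71d77769'
print(extract_token(line))
```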
After the first access with the token included in the URL, the browser
opens with http://localhost:8888/lab.
<!-- ![Remote Code](Setup_img02.png) -->
<img src="../markup/img/G_docker_img02.png" alt="drawing" width="640"/>
Start to work with Jupyter. A Jupyter notebook is a web-form comprised
of cells where Python commands can be entered. Execution is triggered
by `SHIFT + Enter`. Run the Code below (copy & paste from file *print_sys.py* ).
<!-- ![Remote Code](Setup_img03.png) -->
<img src="../markup/img/G_docker_img03.png" alt="drawing" width="640"/>
The notebook is stored **inside** the container under *Untitled.ipynb*.
Shut down the Jupyter-server with: File -> Shutdown.
Restart the (same) container as "daemon" process running in the background (the container remembers flags given at creation: `-p 8888:8888`). The flag `-a` attaches a terminal to show log lines with the token-URL.
```sh
docker start jupyter-minimal -a
```
After login, the prior notebook with the code above is still there under *Untitled.ipynb*. Open and re-run the notebook.
A container preserves its state, e.g. files that get created. Docker
stores them in the container's writable layer on top of the image layers.
Therefore, the size of a container is only defined by state changes that
occurred after the creation of the container instance.
Shut down the Jupyter server with: File -> Shutdown, not with:
```sh
docker stop jupyter-minimal
```
(2 Pts)