internetnl_domain_analyse package
Submodules
internetnl_domain_analyse.domain_analyse_classes module
- class internetnl_domain_analyse.domain_analyse_classes.DomainAnalyser(scan_data_key=None, cache_file_base='tables_df', cache_directory_base_name=None, tld_extract_cache_directory=None, output_file=None, reset=None, records_cache_info: RecordCacheInfo | None = None, internet_nl_filename=None, breakdown_labels=None, statistics: dict | None = None, default_scan=None, variables: dict | None = None, module_info: dict | None = None, weights=None, url_key='website_url', suffix_key='suffix', translations=None, module_key='module', variable_key='variable', sheet_renames=None, n_digits=None, write_dataframe_to_sqlite=False, statistics_to_xls=False, n_bins=100, mode=None, correlations=None, categories=None, dump_cache_as_sqlite=False)[source]
Bases:
object
- get_correct_categories_count()[source]
Count, per record, how many categories are correct and return the result as a dataframe
- class internetnl_domain_analyse.domain_analyse_classes.DomainPlotter(scan_data, scan_data_key=None, default_scan=None, plot_info=None, show_plots=False, barh=False, image_directory=None, cache_directory=None, image_type='pdf', max_plots=None, tex_prepend_path=None, statistics=None, variables=None, cdf_plot=False, bar_plot=False, cor_plot=False, add_logo=True, cumulative=False, show_title=False, breakdown_labels=None, translations: dict | None = None, export_highcharts=False, highcharts_directory=None, correlations=None, tex_horizontal_shift=None, bovenschrift=True, variables_to_plot=None, exclude_variables=None, force_plots=False, latex_files=False, years_to_add_to_plot_legend=None, module_info=None, english=False)[source]
Bases:
object
- class internetnl_domain_analyse.domain_analyse_classes.ImageFileInfo(scan_data_key, cache_file_name_base='image_info', cache_directory='cache')[source]
Bases:
object
- class internetnl_domain_analyse.domain_analyse_classes.PlotInfo(variables_df, var_name, breakdown_name)[source]
Bases:
object
- class internetnl_domain_analyse.domain_analyse_classes.RecordCacheInfo(records_cache_data: dict, year_key: str, stat_directory: str | None = None)[source]
Bases:
object
- internetnl_domain_analyse.domain_analyse_classes.add_missing_years(plot_df, years_to_plot=None, jaar_level_name='Jaar', column=None)[source]
Add missing years
- Parameters:
plot_df – pd.DataFrame DataFrame to plot
years_to_plot – list The years we want to plot
jaar_level_name – str The name of the level (level=) that holds the years
column – str Name of the column, used in the error message
- Returns:
pd.DataFrame
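The idea of adding missing years can be illustrated with a plain pandas reindex. This is a minimal sketch, not the package's implementation: the helper name `add_missing_years_sketch` is hypothetical, and it assumes the years live in a single-level row index named by `jaar_level_name`.

```python
import pandas as pd


def add_missing_years_sketch(plot_df, years_to_plot, jaar_level_name="Jaar"):
    """Hypothetical stand-in: make sure every requested year has a row."""
    # Union of the years already present and the years requested, sorted.
    all_years = plot_df.index.union(pd.Index(years_to_plot)).rename(jaar_level_name)
    # Years without data become rows filled with NaN.
    return plot_df.reindex(all_years)


# Assumed example data: scores for 2020 and 2022, with 2021 missing.
scores = pd.DataFrame(
    {"score": [50.0, 70.0]},
    index=pd.Index([2020, 2022], name="Jaar"),
)
filled = add_missing_years_sketch(scores, years_to_plot=[2020, 2021, 2022])
```

Leaving the added year as NaN keeps the gap visible in a plot instead of silently interpolating a value.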
- internetnl_domain_analyse.domain_analyse_classes.calculate_histogram_per_breakdown(data: DataFrame, var_key: str, df_weights: Series, n_bins: int = 100) dict[source]
Compute, for each breakdown of the data, the histogram that belongs to var_key
- Parameters:
- Returns:
The histograms per breakdown
- Return type:
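A weighted histogram per breakdown could look roughly like the sketch below. It is an illustration under assumptions, not the documented function: `histogram_per_breakdown_sketch` and the `breakdown_key` argument are hypothetical (the real signature has no such argument), and the value range of 0 to 100 is assumed.

```python
import numpy as np
import pandas as pd


def histogram_per_breakdown_sketch(data, var_key, breakdown_key, df_weights, n_bins=100):
    """Hypothetical stand-in: one weighted histogram per breakdown group."""
    histograms = {}
    for label, group in data.groupby(breakdown_key):
        # Select the weights that belong to the rows of this group.
        group_weights = df_weights.loc[group.index].to_numpy()
        counts, edges = np.histogram(
            group[var_key].to_numpy(),
            bins=n_bins,
            range=(0, 100),  # assumed value range
            weights=group_weights,
        )
        histograms[label] = (counts, edges)
    return histograms


# Assumed example data: two breakdown groups with per-record weights.
data = pd.DataFrame({"sector": ["a", "a", "b"], "score": [10.0, 90.0, 60.0]})
weights = pd.Series([1.0, 2.0, 3.0], index=data.index)
histograms = histogram_per_breakdown_sketch(data, "score", "sector", weights, n_bins=2)
```

Returning a dict keyed by breakdown label matches the documented `dict` return annotation; the tuple layout of each value is an assumption.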
internetnl_domain_analyse.domain_plots module
- class internetnl_domain_analyse.domain_plots.AxisLabel(label_properties, text_default=None, positie_default=None)[source]
Bases:
object
Class for storing the properties of an axis label
- internetnl_domain_analyse.domain_plots.make_bar_plot(plot_df, plot_key, plot_variable, scan_data_key, module_name, question_name, image_directory, show_plots=False, add_logo=True, figsize=None, highcharts_height=None, image_type='pdf', reference_lines=None, xoff=0.02, yoff=0.02, show_title=False, barh=False, subplot_adjust=None, sort_values=False, y_max_bar_plot=None, y_spacing_bar_plot=None, translations=None, legend_position=None, legend_max_columns=None, box_margin=None, export_svg=False, export_highcharts=False, highcharts_directory=None, title=None, normalize_data=False, force_plot=False, enable_highcharts_legend=True, unit=None, english=False, bar_width=None)[source]
- internetnl_domain_analyse.domain_plots.make_bar_plot_horizontal(plot_df, fig, axis, margin, plot_title, show_title, translations, reference_lines, line_iter, xoff, yoff, trans, y_spacing_bar_plot, y_max_bar_plot, legend_position, legend_max_columns, add_logo=True, unit=None, english=False, bar_width=None)[source]
- internetnl_domain_analyse.domain_plots.make_bar_plot_stacked(year, plot_df, plot_key, plot_variable, scan_data_key, module_name, question_name, image_directory, show_plots=False, figsize=None, image_type='pdf', reference_lines=None, xoff=0.02, yoff=0.02, show_title=False, barh=False, subplot_adjust=None, sort_values=False, add_logo=True, y_max_bar_plot=None, y_spacing_bar_plot=None, translations=None, legend_position=None, box_margin=None, export_svg=False, export_highcharts=False, highcharts_directory=None, title=None, normalize_data=False, force_plot=False, enable_highcharts_legend=True, unit=None, english=False)[source]
- internetnl_domain_analyse.domain_plots.make_bar_plot_vertical(plot_df, axis, plot_title, show_title, translations, reference_lines, line_iter, xoff, yoff, trans, add_logo=True, unit=None, english=False)[source]
- internetnl_domain_analyse.domain_plots.make_cdf_plot(hist, grp_key, plot_key, scan_data_key, module_name=None, question_name=None, image_directory=None, show_plots=False, figsize=None, image_type=None, image_file_base=None, cummulative=False, reference_lines=None, xoff=None, yoff=None, y_max=None, y_spacing=None, translations=None, export_highcharts=None, export_svg=False, highcharts_info: dict | None = None, title: str | None = None, year: int | None = None, english=False)[source]
- internetnl_domain_analyse.domain_plots.make_conditional_pdf_plot(categories, image_directory, show_plots=False, export_highcharts=False, highcharts_directory=None, cache_directory=None, english=False)[source]
- internetnl_domain_analyse.domain_plots.make_conditional_score_plot(correlations, image_directory, show_plots=False, figsize=None, image_type='.pdf', export_svg=False, export_highcharts=False, highcharts_directory=None, title=None, cache_directory=None, english=False)[source]
- internetnl_domain_analyse.domain_plots.make_heatmap(correlations, image_directory, show_plots=False, figsize=None, image_type='.pdf', export_svg=False, export_highcharts=False, highcharts_directory=None, title=None, cache_directory=None, english=False)[source]
- internetnl_domain_analyse.domain_plots.make_verdeling_per_aantal_categorie(categories, image_directory, show_plots=False, export_highcharts=False, highcharts_directory=None, cache_directory=None, english=False)[source]
internetnl_domain_analyse.domein_analyse module
- internetnl_domain_analyse.domein_analyse.parse_args()[source]
Parse command line parameters
- Parameters:
args ([str]) – command line parameters as list of strings
- Returns:
command line parameters namespace
- Return type:
- internetnl_domain_analyse.domein_analyse.set_do_it_vlaggen(required_keys, chapter_info, recursive=False)[source]
For a chapter of your settings file, impose the do_it flags
- Parameters:
required_keys – list List of the items whose do_it flag you want to impose
chapter_info – the dictionary whose flags are set.
recursive – bool If this is a recursive call, values that are not in the list should not be set to False
- Returns: dict
The new dictionary.
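One plausible reading of the parameter descriptions above is sketched here. The helper `set_do_it_flags_sketch` is hypothetical, and the assumption that each entry of `chapter_info` is a dict carrying a `do_it` key comes from the description, not from the source code.

```python
from copy import deepcopy


def set_do_it_flags_sketch(required_keys, chapter_info, recursive=False):
    """Hypothetical stand-in: switch do_it on for the required keys."""
    updated = deepcopy(chapter_info)  # leave the caller's settings untouched
    for key, properties in updated.items():
        if key in required_keys:
            properties["do_it"] = True
        elif not recursive:
            # Outside a recursive call, everything not requested is switched off.
            properties["do_it"] = False
    return updated


# Assumed example settings chapter.
chapter = {
    "cdf_plot": {"do_it": False},
    "bar_plot": {"do_it": True},
}
top_level = set_do_it_flags_sketch(["cdf_plot"], chapter)
nested = set_do_it_flags_sketch(["cdf_plot"], chapter, recursive=True)
```

In the top-level call `bar_plot` is switched off, while in the recursive call its existing value is kept.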
internetnl_domain_analyse.latex_output module
- class internetnl_domain_analyse.latex_output.ExampleEnvironment(*, options=None, arguments=None, start_arguments=None, **kwargs)[source]
Bases:
Environment
A class representing a custom LaTeX environment.
This class represents a custom LaTeX environment named exampleEnvironment.
- packages = OrderedSet([Package(Arguments('mdframed'), Options())])
- class internetnl_domain_analyse.latex_output.SubFloat(arguments=None, options=None, *, extra_arguments=None)[source]
Bases:
CommandBase
A class representing a custom LaTeX command.
This class represents a custom LaTeX command named exampleCommand.
- packages = OrderedSet()
- internetnl_domain_analyse.latex_output.make_latex_overview(image_info=None, variables=None, image_directory=None, cache_directory=None, image_files=None, tex_horizontal_shift='-2cm', tex_prepend_path=None, bovenschrift=False, module_info=None)[source]
Create a LaTeX output file with all the images
- Parameters:
module_info – class Information about the modules
cache_directory – obj:Path
image_info – object: ImageInfo
variables – dict with variable properties
image_directory – str
image_files – obj:Path
tex_prepend_path – str
tex_horizontal_shift – shift to the left
bovenschrift – boolean Add the caption above figures
internetnl_domain_analyse.utils module
- internetnl_domain_analyse.utils.add_derived_variables(tables, variables)[source]
Add the variables we defined in the settings files which do not exist yet, but are defined with an eval statement
- Parameters:
tables – pd.DataFrame original table of variables
variables – pd.DataFrame properties of variables
- Returns:
pd.DataFrame
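Deriving a variable from an eval statement can be sketched with `DataFrame.eval`. The helper `add_derived_variables_sketch` and the column name `eval` in the variables dataframe are assumptions made for illustration; the actual settings layout is not shown in this reference.

```python
import pandas as pd


def add_derived_variables_sketch(tables, variables, eval_column="eval"):
    """Hypothetical stand-in: add columns defined by an expression."""
    tables = tables.copy()
    # Only variables that carry an expression are candidates.
    for name, expression in variables[eval_column].dropna().items():
        if name not in tables.columns:
            tables[name] = tables.eval(expression)
    return tables


# Assumed example: n_ok is not in the table yet and is defined by an expression.
tables = pd.DataFrame({"tls_ok": [1, 0], "dnssec_ok": [1, 1]})
variables = pd.DataFrame(
    {"eval": [None, None, "tls_ok + dnssec_ok"]},
    index=["tls_ok", "dnssec_ok", "n_ok"],
)
derived = add_derived_variables_sketch(tables, variables)
```

Expressions are evaluated against the table's own columns, so a derived variable can only refer to columns that already exist.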
- internetnl_domain_analyse.utils.add_missing_groups(all_stats, group_by, group_by_original, missing_groups)[source]
- internetnl_domain_analyse.utils.clean_all_suffix(dataframe, suffix_key, variables)[source]
Here we select the suffixes that we have defined.
- Parameters:
dataframe – dataframe with tables, including a column with website extensions
suffix_key – the name of the column with website extensions
variables – dataframe with variable information. Must contain at least one variable
(with the same name as suffix_key) in which the categories are defined
- Returns:
dataframe
- internetnl_domain_analyse.utils.dump_data_frame_as_sqlite(dataframe, file_name)[source]
Dump the data as SQLite, making sure duplicates are removed
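Removing duplicates before writing to SQLite might be done as in the sketch below. The helper `dump_without_duplicates_sketch` and the table name `dataframe` are hypothetical; only `DataFrame.drop_duplicates` and `DataFrame.to_sql` are standard pandas calls.

```python
import sqlite3
import tempfile
from contextlib import closing
from pathlib import Path

import pandas as pd


def dump_without_duplicates_sketch(dataframe, file_name, table_name="dataframe"):
    """Hypothetical stand-in: write only the unique rows to an SQLite file."""
    unique_rows = dataframe.drop_duplicates()
    with closing(sqlite3.connect(file_name)) as connection:
        unique_rows.to_sql(table_name, connection, if_exists="replace", index=False)
        connection.commit()
    return len(unique_rows)


# Assumed example data: the first two records are identical.
records = pd.DataFrame(
    {"website_url": ["a.nl", "a.nl", "b.nl"], "score": [80, 80, 95]}
)
with tempfile.TemporaryDirectory() as directory:
    database_file = Path(directory) / "records.sqlite"
    n_written = dump_without_duplicates_sketch(records, database_file)
    # Read the row count back to confirm what was stored.
    with closing(sqlite3.connect(database_file)) as connection:
        n_stored = connection.execute("SELECT COUNT(*) FROM dataframe").fetchone()[0]
```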
- internetnl_domain_analyse.utils.get_all_clean_urls(urls, show_progress=False, cache_directory=None)[source]
- internetnl_domain_analyse.utils.get_option_mask(question_df, variables, question_type, valid_options=None)[source]
Get the mask to filter the positive options from a question
- internetnl_domain_analyse.utils.get_windows_or_linux_value(value)[source]
Adjust the value if it is given as a dict with a windows and a linux field
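A platform-dependent setting of this kind can be resolved as sketched here. The helper `pick_platform_value` is hypothetical, and the lowercase key names `windows` and `linux` are assumed from the description.

```python
import platform


def pick_platform_value(value):
    """Hypothetical stand-in: resolve a windows/linux dict to a single value."""
    if isinstance(value, dict) and {"windows", "linux"} <= set(value):
        # Assumption: everything that is not Windows uses the linux entry.
        key = "windows" if platform.system() == "Windows" else "linux"
        return value[key]
    # Plain values pass through unchanged.
    return value


plain = pick_platform_value("cache")
resolved = pick_platform_value({"windows": "C:/data/cache", "linux": "/data/cache"})
```

This lets one settings file serve both operating systems, with only the path-like entries written as a two-key dict.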
- internetnl_domain_analyse.utils.impose_variable_defaults(variables, module_info: dict | None = None, module_key: str | None = None)[source]
Impose default values on the variables data frame
- Parameters:
variables (pd.DataFrame) – Dataframe with the initial variables
module_info (pd.DataFrame) – Dataframe with information per module
module_key (str) – Key of the module in the dataframe
- Returns:
Filled dataframe
- Return type:
pd.DataFrame